ABSTRACT

Using SDSS Data Release 6, we construct two independent samples of candidate stellar wide binaries
selected as i) pairs of unresolved sources with angular separation in the range 3"-16", ii) common
proper motion pairs with 5"-30" angular separation, and make them publicly available. These
samples are dominated by disk stars, and we use them to constrain the shape of the main-sequence
photometric parallax relation $M_r(r-i)$, and to study the properties of wide binary systems. We
estimate $M_r(r-i)$ by searching for a relation that minimizes the difference between distance moduli
of primary and secondary components of wide binary candidates. We model $M_r(r-i)$ by a fourth
degree polynomial and determine the coefficients using Markov Chain Monte Carlo fitting, independently
for each sample. Both samples yield similar relations, with the largest systematic difference of
0.25 mag for F0 to M5 stars, and a root-mean-square scatter of 0.13 mag. A similar level of agreement
is obtained with photometric parallax relations recently proposed by Jurić et al. (2008). The
measurements show a root-mean-square scatter of ~0.30 mag around the best fit $M_r(r-i)$ relation,
and a mildly non-Gaussian distribution. We attribute this scatter to metallicity effects and additional
unresolved multiplicity of wide binary components. Aided by the derived photometric parallax relation,
we construct a series of high-quality catalogs of candidate main-sequence binary stars. These
range from a sample of ~17,000 candidates with the probability of each pair being a physical binary
(the "efficiency") of ~65%, to a volume-limited sample of ~1,800 candidates with an efficiency
of ~90%. Using these catalogs, we study the distribution of semi-major axes of wide binaries, $a$,
in the 2,000 < $a$ < 47,000 AU range. We find the observations to be well described by the Öpik
distribution, $f(a) \propto 1/a$, for $a < a_{break}$, where $a_{break}$ increases roughly linearly with the height $Z$
above the Galactic plane ($a_{break} \propto 12,300\,Z[\mathrm{kpc}]^{0.7}$ AU). The number of wide binary systems with
100 AU < $a$ < $a_{break}$, as a fraction of the total number of stars, decreases from 0.9% at $Z = 0.5$ kpc
to 0.5% at $Z = 3$ kpc. The probability for a star to be in a wide binary system is independent of
its color. Given this color, the companions of red components seem to be drawn randomly from the
stellar luminosity function, while blue components have a larger blue-to-red companion ratio than
expected from the luminosity function.

Subject headings: binaries: visual — stars: distances — Hertzsprung-Russell diagram 



1. INTRODUCTION 

Binary systems can be roughly divided into close (semi-major
axes $a < 10$ AU) and wide (semi-major axes
$a > 100$ AU; Chanamé 2007) pairs. Close binary systems
have long been recognized as useful tools for studies
of stellar properties. For example, stellar parameters
such as the masses and radii of individual stars
are readily determined to high confidence using eclipsing
binaries (Andersen 1991). Wide binary systems have
proven to be a tool for studies of star formation processes,
as well as an exceptionally useful tracer of the local
potential and tidal fields through which they traverse.
Specifically, they were used to place constraints on
the nature of halo dark matter (Yoo, Chanamé, & Gould
2004) and to explore the dynamical history of the Galaxy
(Allen, Poveda, & Hernández-Alcántara 2007). A further
comprehensive list of current applications of wide
binaries can be found in Chanamé (2003).

Close binaries, owing to their relatively short orbital
periods and equally short timescales of brightness
or spectrum fluctuations, are fairly easy to detect.
Unambiguous identification of wide binary systems,
on the other hand, requires accurate astrometry
on much longer timescales, as these systems have
orbital periods > 10,000 years. However, instead of
requiring unambiguous identification, large samples of
candidate wide binaries can be selected by simply assuming
that pairs of stars with small angular separation
are also gravitationally bound (Bahcall & Soneira 1981;
Gould 1995), or by searching for common proper motion
pairs (Luyten 1979; Poveda et al. 1994; Allen, Poveda,
& Herrera 2000; Gould & Salim 2003; Chanamé & Gould 2004;
Lépine & Bongiorno 2007).



The angular separation method is simple to apply, but 
it also introduces a relatively large number of false can- 
didates due to chance association of nearby pairs. The 
contamination by random associations can be reduced by 
imposing constraints, such as the common proper mo- 
tion, or by requiring that the stars are at similar dis- 
tances. The distances can be inferred through a variety 
of means, one of which is the use of an appropriate photometric
parallax$^3$ relation.

The photometric parallax relation provides the absolute
magnitude of a star given that star's color and metallicity.
There are a number of proposed photometric parallax
relations for main sequence stars in the literature
that differ in the methodology used to derive them, photometric
systems, and the absolute magnitude and metallicity
range in which they are applicable. Not all of them
are mutually consistent, and most exhibit significant intrinsic
scatter of order half a magnitude or more (see
Figure 2 in Jurić et al. 2008, hereafter J08).

3 Also known as "color-luminosity relation".

Instead of using an existing relation to select wide binaries,
we propose a novel method that simultaneously
derives the photometric parallax relation and selects a
sample of wide binary candidates. The method relies on
the fact that components of a physical binary have equal
distance moduli ($m_1 - M_1 = m_2 - M_2$) and therefore
$\delta = \Delta M - \Delta m = (M_2 - M_1) - (m_2 - m_1) = 0$. Assuming
that both stars are on the main sequence, and the shape
of the adopted photometric parallax relation is correct,
the difference in absolute magnitudes $\Delta M = M_2 - M_1$
calculated from the parallax relation must equal the measured
difference of apparent magnitudes, $\Delta m = m_2 - m_1$.
The $\Delta M = \Delta m$ equality for binaries must be valid irrespective
of color, and therefore represents a test of the
validity of the adopted photometric parallax relation or,
alternatively, a way to estimate the parallax relation.
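
As a minimal sketch of the $\delta$ statistic defined above (not the fitting code used in this work), the following Python fragment evaluates $\delta$ for a pair of stars. The polynomial coefficients in `COEFFS`, the function names, and the synthetic magnitudes are illustrative assumptions and do not represent the fitted relation.

```python
# Hypothetical polynomial coefficients (lowest order first) for M_r as a
# function of color; placeholders for illustration, not fitted values.
COEFFS = [4.0, 10.0, -3.0, 0.5, 0.0]


def abs_mag(color, coeffs=COEFFS):
    """Evaluate the polynomial M_r(color) with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * color + c
    return result


def delta(r1, color1, r2, color2, coeffs=COEFFS):
    """delta = Delta M - Delta m, with subscript 1 for the brighter star."""
    d_abs = abs_mag(color2, coeffs) - abs_mag(color1, coeffs)  # Delta M
    d_app = r2 - r1                                            # Delta m
    return d_abs - d_app


# Synthetic physical pair: both components share one distance modulus.
dm = 10.0
color1, color2 = 0.4, 1.2
r1 = abs_mag(color1) + dm
r2 = abs_mag(color2) + dm
print(delta(r1, color1, r2, color2))  # consistent with zero
```

For a physical pair the result is consistent with zero regardless of the two colors, while a chance pair whose distance moduli differ by $x$ mag gives $|\delta| = x$ when the adopted relation is correct.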

In practice, the distribution of $\delta$ will not be a delta
function, due to both instrumental (finite photometric
precision) and physical effects (true vs. apparent pairs).
However, for true wide binaries, the distribution of $\delta$ is
expected to be narrow, strongly peaked at zero, and the
individual $\delta$ values are expected to be uncorrelated with
color. In contrast, the distribution of $\delta$ values for randomly
associated stellar pairs (hereafter random pairs)
should be much broader even when the correct photo- 
metric parallax relation is adopted, reflecting the differ- 
ent distances of components of projected binary pairs. 
This dichotomy can be used to assign a probability to 
each candidate, of whether it is a true physical binary or 
a result of chance projection on the sky. 

The paper is organized as follows. In Section 2 we
give an overview of the SDSS imaging data, and describe
the selection, completeness and population composition
of two initial, independent samples of candidate binaries.
In Section 3 we describe the photometric parallax estimation
method, compare the best-fit photometric parallax
relations to the J08 relation, and analyze the scatter in
predicted absolute magnitudes. The properties of wide
binaries, such as the color and spatial distributions, are
analyzed in Section 4. Finally, the results and their implications
for future surveys are discussed in Section 5.

2. THE DATA 

2.1. Overview of the SDSS Imaging Data 

Thanks to the quality of its photometry and astrom- 
etry, as well as the large sky coverage, the SDSS stands 
out among available optical sky surveys. The SDSS 
provides homogeneous and deep ($r < 22.5$) photometry
in five bandpasses ($u$, $g$, $r$, $i$, and $z$; Gunn et al. 1998;
Hogg et al. 2002; Smith et al. 2003; Gunn et al. 2006;
Tucker et al. 2006) accurate to 0.02 mag (rms scatter)
for unresolved sources not limited by photon statistics
(Scranton et al. 2002; Ivezić et al. 2003), and with
a zeropoint uncertainty of 0.02 mag (Ivezić et al. 2004).
The survey sky coverage of 10,000 deg$^2$ in the northern
Galactic cap and 300 deg$^2$ in the southern Galactic
cap results in photometric measurements for well
over 100 million stars and a similar number of galaxies
(Stoughton et al. 2002). The recent Data Release 6
(Adelman-McCarthy et al. 2008) lists$^4$ photometric data
for 287 million unique objects observed in 9583 deg$^2$ of
sky, and can be accessed through the Catalog Archive
Server$^5$ (CAS) CasJobs$^6$ interface. Astrometric positions
are accurate to better than 0.1" per coordinate (rms) for
sources with $r < 20.5$ (Pier et al. 2003), and the morphological
information from the images allows reliable
star-galaxy separation to $r \sim 21.5$ (Lupton et al. 2002).

The five-band SDSS photometry can be used for
very detailed source classification, e.g., separation
of quasars and stars (Richards et al. 2002), spectral
classification of stars to within one to two spectral
subtypes (Lenz et al. 1998; Finlator et al. 2000;
Hawley et al. 2002; Covey et al. 2007), identification of
horizontal-branch and RR Lyrae stars (Yanny et al. 2000;
Sirko et al. 2004; Ivezić et al. 2005; Sesar et al. 2007),
and low-metallicity G and K giants (Helmi et al. 2001).



Proper motion data exist for SDSS sources matched
to the USNO-B1.0 catalog (Monet et al. 2003). We
take proper motion measurements from the Munn et al.
(2004) catalog based on astrometric measurements from
the SDSS and Palomar Observatory Sky Surveys (POSS-I,
Minkowski & Abell 1963; POSS-II, Reid et al. 1991).
Despite the sizable random and systematic astrometric
errors in the Schmidt surveys, the combination of a long
baseline (50 years for the POSS-I survey) and a recalibration
of the photographic data using positions of SDSS
galaxies results in median random errors for proper motions
of only 3 mas yr$^{-1}$ for $r < 19.5$ (per coordinate),
with substantially smaller systematic errors (Munn et al.
2004). Following a recommendation by Munn et al.,
when using their catalog we select SDSS stars with only 
one USNO-B match within 1", and require proper mo- 
tion rms fit residuals to be less than 350 mas in both 
coordinates. We note that the proper motion measure- 
ments publicly available as a part of SDSS Data Release 
6 are known to have significant systematic errors (Munn 
et al., in prep.). Here we use a revised set of proper mo- 
tion measurements which will become publicly available 
as a part of SDSS Data Release 7. 

2.2. The Initial Sample of Close Resolved Stellar Pairs 

For objects in the SDSS catalog, the photometric
pipeline (Lupton et al. 2002) sets a number of flags that
indicate the status of each object, warn of possible prob- 
lems with the image itself, and warn of possible problems 
in the measurement of various quantities associated with 
the object. These flags can be used to remove duplicate 
detections (in software) of the same object, and to select 
samples of unresolved sources with good photometry. 

According to the SDSS Catalog Archive Server "Algorithms"
webpage$^7$, duplicate detections of the same
objects can be removed by considering only those which
have the "status" flag set to PRIMARY. We consider
only PRIMARY objects, and select those with good photometry
by requiring that the BINNED1 flag is set to 1,
and that the PSF_FLUX_INTERP, DEBLEND_NOPEAK,
INTERP_CENTER, BAD_COUNTS_ERROR,
NOTCHECKED, NOPROFILE, PEAKCENTER,
and EDGE image processing flags are set to 0 in the
gri bands. Moving unresolved sources, such as
asteroids, are avoided by selecting sources with the
DEBLENDED_AS_MOVING flag set to 0.

4 See http://www.sdss.org/dr6

5 http://cas.sdss.org

6 http://casjobs.sdss.org/CasJobs/

7 http://cas.sdss.org/dr6/en/help/docs/algorithm.asp?key=flags

Good photometric accuracy (mean PSF magnitude errors
< 0.03 mag, see Figure 1 in Sesar et al. 2007) is obtained
by selecting sources with $14 < r' < 20.5$, where $r'$
is the $r$ band PSF magnitude uncorrected for ISM extinction.
The PSF magnitudes corrected for ISM extinction
(using maps by Schlegel, Finkbeiner, & Davis 1998), and
used throughout this work, are noted as $u$, $g$, $r$, $i$, and $z$.

To create the initial sample of resolved stellar pairs,
we query$^8$ the CAS "Neighbors" table (which lists all SDSS
pairs within 30") for pairs of sources that pass the above
criteria, and that have

$$(r_1 - r_2)\,[(g-i)_1 - (g-i)_2] > 0, \qquad (1)$$

where the subscript 1 is hereafter assigned to the brighter
component. With this condition we require that the component
with bluer $g - i$ color is brighter in the $r$ band.
About 40% of random pairs are rejected with this condi- 
tion. We estimate that about 3% of true binary systems 
might be excluded by this cut (due to uncertainties in 
the g — i color caused by photometric errors), but their 
exclusion does not significantly influence our results. 
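
The condition in Equation 1 amounts to a sign test on two differences. A minimal Python sketch follows; the function name and the sample magnitudes are hypothetical and serve only to illustrate the logic.

```python
def passes_color_order(r1, gi1, r2, gi2):
    """Equation 1: keep the pair only if the bluer (smaller g-i)
    component is also the brighter (smaller r) one."""
    return (r1 - r2) * (gi1 - gi2) > 0


# Brighter star is bluer: both differences are negative, product positive.
print(passes_color_order(15.0, 0.5, 17.0, 1.5))  # True
# Brighter star is redder: the pair is rejected.
print(passes_color_order(15.0, 1.5, 17.0, 0.5))  # False
```

Because the test is symmetric under exchange of the two components, it does not depend on which star is labeled 1.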

We select ~4.2 million pairs for the initial sample of
resolved stellar pairs, and plot the observed distribution
of angular separation $\theta$, $f_{obs}(\theta)$, in Figure 1 (top). For
a uniform (random) distribution of stars, the number of
neighboring stars within an annulus $\Delta\theta$ increases linearly
with $\theta$, and therefore, the number of random pairs also
increases with $\theta$. To find the number of random pairs as
a function of $\theta$, we fit $f_{rnd}(\theta) = C\theta$ to the $f_{obs}$ histogram
(in the $\theta > 15''$ region), and find $C = 9043$ arcsec$^{-1}$. For
large separation angles ($\theta > 15''$) the two distributions
closely match, indicating that the majority of observed
pairs are simply random associations, and are not physically
related. At separation angles smaller than ~15"
the frequency of observed pairs shows an excess, suggest- 
ing the presence of true, gravitationally bound systems. 
However, even at small separation angles, the selected 
pairs include a non-negligible fraction of random pairs 
and require further refinement, or careful statistical ac- 
counting for random contamination. 

Throughout this work we use samples of random pairs 
(random samples, hereafter) to account for random con- 
tamination in candidate binaries. We define the random 
sample as a sample of pairs with $20'' < \theta < 30''$ taken
from the initial pool of stellar pairs. Since pairs in the
random sample pass the same data quality selection as
candidate binaries, and since virtually all of them are
chance associations (99.75%; see Section 4.1 and Figure 1),
the random sample is a fair representation of the
population of randomly associated stars in candidate bi- 
nary samples. 

2.3. The Geometric Selection 

The excess of pairs with $\theta < 15''$ in Figure 1 (top) likely
indicates the presence of true binaries, and the angular
separation provides a simple, geometric criterion to select
candidate binary systems. This excess, shown as the
ratio $f_{obs}/f_{rnd}$ in Figure 1 (bottom), increases for
$\theta < 15''$, reaches a relatively flat peak of ~1.45 for
$3'' < \theta < 4''$, and sharply decreases for $\theta < 2''$ due to finite seeing
and the inability to resolve close pairs of sources. This excess
is related to the fraction of true binaries, $\epsilon(\theta)$, as

$$\epsilon(\theta) = 1 - f_{rnd}(\theta)/f_{obs}(\theta). \qquad (2)$$

Using Figure 1 (bottom), we choose $3'' < \theta < 4''$ for our
geometric selection criterion, since the fraction of true
binaries is expected to reach a maximum of ~35% in
this range.

8 SQL queries are listed in Appendix A.
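
The fit of the random-pair term and the efficiency of Equation 2 can be sketched as follows. The histogram is synthetic, and the function names, the through-the-origin least-squares estimator, and the shape of the small-separation excess are assumptions made for illustration; none of them is taken from the analysis in this paper.

```python
import math


def fit_random_slope(theta, f_obs, theta_min=15.0):
    """Least-squares slope C of f_rnd = C * theta (through the origin),
    using only bins with theta > theta_min."""
    pairs = [(t, f) for t, f in zip(theta, f_obs) if t > theta_min]
    return sum(t * f for t, f in pairs) / sum(t * t for t, _ in pairs)


def efficiency(theta, f_obs, slope):
    """Equation 2: fraction of true binaries, 1 - f_rnd / f_obs, per bin."""
    return [1.0 - slope * t / f for t, f in zip(theta, f_obs)]


# Synthetic histogram: a linear random term plus a 45% excess near 3.5".
theta = [2.5 + k for k in range(28)]  # bin centers, 2.5" to 29.5"
C_TRUE = 9043.0
f_obs = [C_TRUE * t * (1.0 + 0.45 * math.exp(-(t - 3.5) ** 2 / 20.0))
         for t in theta]

C_fit = fit_random_slope(theta, f_obs)
eps = efficiency(theta, f_obs, C_fit)
print(C_fit)   # close to the input slope
print(eps[1])  # roughly 0.31 at theta = 3.5", where f_obs / f_rnd = 1.45
```

Restricting the fit to large separations keeps the excess of physical pairs from biasing the slope of the random term.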

The interpretation of the excess of close stellar pairs 
as gravitationally bound binary pairs implies that the 
components are at similar distances. If this is true, and if 
it is possible to constrain the distance via a photometric 
parallax relation, then their distribution in the color-magnitude
diagram should be different than for a sample
of randomly associated stars. 

To test this hypothesis, we select 51,753 candidate binaries
with $3'' < \theta < 4''$. We compare their distribution
in the $\Delta r = r_2 - r_1$ vs. $\Delta(g-i) = (g-i)_2 - (g-i)_1$
diagram to the distribution of pairs from the random
sample, as shown in Figure 2. The number of pairs in
this random sample is restricted to 51,753. Were the selection
a random process, the selected candidates would
have the same distribution in this diagram as the random
sample, and the average candidate-to-random ratio
would be ~1. However, in the region where

$$4.33\,\Delta(g-i) - \Delta r + 0.4 > 0, \qquad (3)$$

and

$$2.31\,\Delta(g-i) - \Delta r - 0.46 < 0, \qquad (4)$$

the two distributions are different (average candidate-to-random
ratio of ~1.7), implying that > 40% of candidates
are found at similar distances. In principle, a
selection cut using Equations 3 and 4 could be made to
increase the fraction of true binaries in the candidate
sample. We do not make such a cut a priori, but instead
develop a method (described in Section 3) that robustly
"ignores" random pairs while estimating the photometric
parallax relation. After a best-fit photometric parallax
relation is obtained, the contamination can be minimized
by selecting only pairs where both components are at
similar distances, as described in Section 4.1.
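
A short Python sketch of the region bounded by Equations 3 and 4, together with the arithmetic behind the "> 40%" statement, is given below. The function names are hypothetical, and the conversion from ratio to excess fraction assumes that the random sample correctly describes the chance pairs among the candidates.

```python
def in_excess_region(d_gi, d_r):
    """True when (Delta(g-i), Delta r) satisfies Equations 3 and 4."""
    eq3 = 4.33 * d_gi - d_r + 0.4 > 0
    eq4 = 2.31 * d_gi - d_r - 0.46 < 0
    return eq3 and eq4


def excess_fraction(ratio):
    """Fraction of candidates above the random expectation for a given
    candidate-to-random ratio."""
    return 1.0 - 1.0 / ratio


print(in_excess_region(1.0, 3.0))      # True: between the two lines
print(in_excess_region(1.0, 1.0))      # False: fails Equation 4
print(round(excess_fraction(1.7), 2))  # 0.41
```

For a color difference of 1 mag the two inequalities bracket the magnitude difference to $1.85 < \Delta r < 4.73$.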

The $r$ vs. $g - i$ distributions of brighter and fainter
components of candidate binaries are shown in Figure 3.
We find that the brighter components in the candidate 
sample are mostly disk G to M dwarfs, while the fainter 
components are mostly M dwarfs. 

2.4. The Kinematic Selection 

As seen from Figure 1 (top), candidate binaries with
$\theta > 15''$ cannot be efficiently selected using angular distance
only, as nearly all pairs in this range are most likely
chance associations. In this regime, a kinematic selection 
based on common proper motion should be more effi- 
cient, as random pairs have a small probability (~ 0.005 
determined using Monte Carlo simulations) to be com- 
mon proper motion pairs (using selection criteria listed 
below). 

We therefore select a second sample of 14,148 candidate
binaries by searching for common proper motion
pairs with proper motion difference $\Delta\mu = |\mu_2 - \mu_1| < 5$
mas yr$^{-1}$, and with absolute proper motion in the range
15 mas yr$^{-1}$ $< |\mu|_{max} <$ 400 mas yr$^{-1}$, where
$|\mu|_{max} = \max(|\mu_1|, |\mu_2|)$. These criteria require that the directions
of two proper motion vectors agree at a $1\sigma$ level, and that
the proper motion is detected at a $5\sigma$ level or higher. The
common proper motion pairs with orbital motion > 1"
over 50 years are not selected because their USNO-B and
SDSS positions place them outside the 1" search radius
used by Munn et al. The angular separation of common
proper motion pairs is limited to $9'' < \theta < 30''$. Pairs of
sources with $\theta < 9''$ are usually blended in the USNO-B
data and may not have reliable proper motion measurements
(see Section 4.4), while the maximum angular
separation between sources in the CAS "Neighbors" table
defines the upper limit of $\theta < 30''$. However, for purposes
of Section POI we have created a sample of common
proper motion pairs that extends to $\theta = 500''$. We have
done so by matching SDSS sources (that pass the quality
flags from Section 2.2) within a 500" search radius
into common proper motion pairs. Since this matching
is computationally expensive, we have done this only for
one sample. The $r$ vs. $g - i$ distributions of brighter and
fainter components of kinematically-selected candidate
binaries are similar to those shown in Figure 3.
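
The kinematic criteria above can be collected into a single predicate. The sketch below is an illustration only: the function name and argument layout are hypothetical, and it assumes that the proper motion difference is the magnitude of the vector difference of the two proper motions.

```python
import math


def is_common_pm_pair(mu1, mu2, theta,
                      max_diff=5.0, mu_lo=15.0, mu_hi=400.0,
                      theta_lo=9.0, theta_hi=30.0):
    """mu1, mu2: (mu_ra, mu_dec) in mas/yr; theta: separation in arcsec.

    Assumes Delta mu is the length of the vector difference."""
    diff = math.hypot(mu2[0] - mu1[0], mu2[1] - mu1[1])
    mu_max = max(math.hypot(*mu1), math.hypot(*mu2))
    return (diff < max_diff
            and mu_lo < mu_max < mu_hi
            and theta_lo < theta < theta_hi)


print(is_common_pm_pair((20.0, 0.0), (22.0, 1.0), 12.0))  # True
print(is_common_pm_pair((20.0, 0.0), (30.0, 0.0), 12.0))  # False: motions differ
print(is_common_pm_pair((5.0, 0.0), (6.0, 0.0), 12.0))    # False: motion too small
```

With a random error of about 3 mas yr$^{-1}$ per coordinate, the 15 mas yr$^{-1}$ lower limit corresponds to the $5\sigma$ detection threshold quoted in the text.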

2.5. The Sample Completeness 

Before proceeding with the determination of photo- 
metric parallax relations and discussion of the properties 
of wide binary systems, we summarize the completeness 
of geometric and kinematic samples, and estimate their 
expected fractions of disk and halo populations. The 
samples are selected from a high-dimensional space of
measured parameters, and an understanding of the selection
effects is a prerequisite for determining the limitations
of various derived statistical properties. For example,
the geometric sample is selected using five parameters:
the $g - i$ color of the two components, $(g-i)_1$
and $(g-i)_2$, their apparent magnitudes, $r_1$ and $r_2$, and
their angular separation on the sky, $\theta$. The latter three
can be transformed with the aid of a photometric parallax
relation into a difference of their apparent magnitudes,
$\Delta m = r_2 - r_1$, distance $D$, and the projected
physical separation, $a$. We seek to constrain the photometric
parallax relation by minimizing the difference
$\delta = \Delta M - \Delta m$, where $\Delta M$ is a two-dimensional function
of $(g-i)_1$ and $(g-i)_2$ (Section 3), and at the same
time derive constraints on the two-dimensional color distribution
of wide binaries, on their $a$ distribution, and
on any variation of these distributions with position in
the Galaxy (Section 4). Not all of these constraints can
be derived independently of each other, and most are
subject to severe selection effects. By judiciously selecting
data subsets and projections of this five-dimensional
parameter space, these effects can be understood and
controlled, as described below.

To illustrate the most important selection effects, we
employ the photometric parallax relation and its dependence
on metallicity derived by Ivezić et al. (2008a, hereafter
I08a). The quantitative differences between their
photometric parallax relation and the ones derived here
have negligible impact on the conclusions derived in this
Section. For simplicity, we select a sample of ~2.8 million
stars with $r < 21.5$ observed towards the north Galactic
pole ($b > 70°$), and study their counts as a function
of distance and the $g - i$ color. Due to this choice of
field position, the distance to each star is approximately
equal to its distance from the Galactic plane (for a detailed
study of the dependence of stellar number density
on position within the Milky Way, see J08). Figure 4
illustrates several important selection effects.

First, for any $g - i$ color there is a minimum and maximum
distance corresponding to the SDSS saturation
limit at $r \sim 14$ and the adopted faint limit at $r = 21.5$;
the probed distance range extends from 100 pc to 25 kpc.
Within the distance limits appropriate for a given color,
the sample is essentially complete (~98%, Finlator et al.
2000). Second, these limits are strongly dependent on
color: the bluest stars saturate at a distance of about 1
kpc, while the reddest stars are too faint to be detected
even at a few hundred pc. Equivalently, due to the finite
dynamic range of SDSS apparent magnitudes, there is no
distance range where the entire color range from the blue
disk turn-off edge to the red edge of the luminosity function is
completely covered. At best, at distances of about 1 kpc
the color completeness extends from the blue edge to the
peak of the luminosity function at $g - i \sim 2.7$. Third, when
pairing stars into candidate binary systems, their color
distribution at a given distance will be clipped, because
the requirement that the differences of apparent and
absolute magnitudes are similar places the two stars from
a candidate pair into a narrow horizontal strip in the
distance modulus (DM) vs. $g - i$ diagram shown in Figure 4.
As a result, the ratio of the number of candidate binaries and the
number of all single stars in the sample decreases at distances
significantly different from ~1 kpc because of a
bias against blue-red pairs.

The binary samples selected from the ~1 kpc distance 
range can be used to measure the two-dimensional color 
distribution of wide binaries, as well as to gauge the de- 
pendence of their a distribution on color. The depen- 
dence of the a distribution on distance from the Galac- 
tic plane can also be studied over a substantial distance 
range, but only under the assumption that it is indepen- 
dent of color. 

The imposed $\theta$ range (3" to 30") limits the range of
probed physical separation to values proportional to distance,
and ranging from 3,000 AU to 30,000 AU at a
distance of 1 kpc. We discuss and account for these effects
in more detail in Section 4.3.
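
The magnitude limits and the angular separation range discussed in this Section translate into distances and physical separations through two simple relations, sketched below in Python. The absolute magnitudes used for the "blue" and "red" stars are illustrative assumptions, not values from the photometric parallax relation adopted here.

```python
def distance_pc(r, abs_mag):
    """Distance in pc from the distance modulus r - M_r = 5 log10(D) - 5."""
    return 10.0 ** ((r - abs_mag) / 5.0 + 1.0)


def projected_separation_au(theta_arcsec, dist_pc):
    """Small-angle relation: 1 arcsec at 1 pc subtends 1 AU."""
    return theta_arcsec * dist_pc


R_BRIGHT, R_FAINT = 14.0, 21.5  # saturation and faint limits in r

# Illustrative absolute magnitudes (assumed, not from the fitted relation).
for label, m_r in [("blue, M_r = 4", 4.0), ("red, M_r = 14", 14.0)]:
    print(label, distance_pc(R_BRIGHT, m_r), distance_pc(R_FAINT, m_r))

# Separation limits at a distance of 1 kpc.
print(projected_separation_au(3.0, 1000.0), projected_separation_au(30.0, 1000.0))
```

Under these assumed magnitudes the blue star saturates at 1 kpc, and the red star drops below the faint limit at roughly 300 pc, which illustrates why no single distance covers the full color range.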

2.6. The Separation of Disk and Halo Populations

The counts of main-sequence stars shown in Figure 4
include both disk and halo populations. With the avail- 
able data, there are three methods that might be used 
for separating stars (including candidate binary systems) 
into disk and halo populations (Juric et al., in prep.): 

1. A statistical method based on the stellar number 
density profiles (J08): beyond about 3 kpc from the 
plane, halo stars begin to dominate. However, as 
shown in Figure 4, only stars bluer than $g - i = 2$
are detected at such distances. The stellar number 
density profiles suggest that the fraction of halo 
stars is below ~20% closer than 1.5 kpc from the 
Galactic plane (see Figure 6 in I08a). 

2. Classification based on metallicity into low-metallicity
([Fe/H] < -1) halo stars and higher
metallicity stars. As shown by I08a, this is a robust
and accurate method even when using a photometric
metallicity estimator, but it works only for stars
with $g - i < 0.7$ due to the limitations of the photometric
metallicity method, and the SDSS spectroscopic
metallicity is available only for a small
fraction of stars in the candidate samples.

3. Kinematic selection based on proper motion
measurements, and implemented via a reduced
proper motion diagram (e.g., Salim & Gould 2003;
Munn et al. 2004, and references therein). However,
as discussed in detail in Appendix B, this
method is robust only closer than 2-3 kpc from the
Galactic plane due to a rotational velocity gradient
of disk stars which diminishes kinematic differences
between halo and disk stars further away from the
plane.

Given the limitations of these methods, it is not
possible to reliably separate disk and halo populations
throughout the explored parameter space, and in both
geometric and kinematic samples. For the geometric sample,
the third method is not applicable because SDSS-POSS
proper motions are not reliable at small angular distances
($\theta < 9''$; see Section 4.4). The $g - i < 0.7$ cut
required for the second method results in a subsample
with too narrow a color range to constrain the photometric
parallax relation. Nevertheless, the analysis of
this subsample based on results from I08a indicates that
fewer than 10% of stars in the geometric sample belong to
the halo population (this fraction increases with the distance
from the Galactic plane; see Figure 6 in I08a), and thus
we expect that halo contamination plays only a minor
role in the geometric sample.

The kinematic sample is expected to include a non-negligible
fraction of halo stars due to the selection of
stars with substantial proper motions. We use the reduced
proper motion diagram to estimate the fraction of
halo candidate binary stars in this sample. The reduced
proper motion for an arbitrary photometric bandpass,
here $r$, is defined as

$$r_{RPM} = r + 5\log(\mu), \qquad (5)$$

where $\mu$ is proper motion in arcsec yr$^{-1}$ (sometimes an
additional offset of 5 mag is added). Using a relationship
between proper motion, distance and tangential velocity,

$$v_t = 4.74\,\mu D \qquad (6)$$

and

$$r - M_r = 5\log(D) - 5, \qquad (7)$$

Equation 5 can be rewritten as

$$r_{RPM} = M_r + 5\log(v_t) + C, \qquad (8)$$

where $D$ is distance in parsec, $M_r$ is the absolute magnitude,
$v_t$ is the heliocentric tangential velocity (the
projection of the heliocentric velocity on the plane of the
sky), and $C$ is a constant ($C = -8.25$ if $v_t$ is expressed
in km s$^{-1}$). Therefore, for a population of stars with the
same $v_t$, the reduced proper motion is a measure of their
absolute magnitude. As shown using data similar to those
discussed here, halo and disk stars form two well-defined
and separated sequences in the reduced proper motion
vs. color diagram (e.g., Salim & Gould 2003; Munn et al.
2004, and references therein). We discuss the impact of
different metallicity and velocity distributions of halo and
disk stars on their reduced proper motion distributions
in more detail in Appendix B.

Figure 5 shows reduced proper motion diagrams for
stars observed towards the north Galactic pole, constructed
for two ranges of observed proper motion: 15-50
mas yr$^{-1}$ and 50-400 mas yr$^{-1}$. The choice of the proper
motion range, together with unavoidable apparent magnitude
limits, strongly affects the probed distance range:
the larger the proper motion, the closer the distance
range over which the selection fraction is non-negligible.
We find that the two sequences closely follow the expectations
based on the analysis of metallicity and velocity
distributions from I08a. The halo sequence can
be efficiently separated by selecting stars with reduced
proper motion larger than a boundary generated using
the photometric parallax relation from I08a, evaluated
for the median halo metallicity ([Fe/H] = -1.5) and
with $v_t = 180$ km s$^{-1}$ (see Equation 8). This separation
method is conceptually identical to the $\eta$ separator discussed
by Salim & Gould (2003). They also proposed
to account for a shift of the reduced proper motion sequences
with Galactic latitude, an effect which we discuss
in more detail in Appendix B. For the reasons described
there, to account for the variation of the reduced proper
motion sequences away from the Galactic pole, we simply
offset the $v_t$ value from 180 km s$^{-1}$ to 110 km s$^{-1}$
(i.e., the separator moves upwards in Figure 5 by 1 mag).
While this selection removes some disk binaries, it is designed
to exclude most halo binaries from the sample.
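
The size of the separator shift follows directly from Equation 8, since at fixed absolute magnitude the reduced proper motion changes by 5 log of the velocity ratio. A minimal Python sketch, with hypothetical function names, is:

```python
import math


def reduced_proper_motion(r, mu_arcsec_yr):
    """Equation 5: r_RPM = r + 5 log10(mu), with mu in arcsec/yr."""
    return r + 5.0 * math.log10(mu_arcsec_yr)


def separator_shift(v_old, v_new):
    """Change in the separator's r_RPM (Equation 8) when the adopted
    tangential velocity changes at fixed M_r."""
    return 5.0 * math.log10(v_new / v_old)


print(reduced_proper_motion(18.0, 0.1))  # r = 18 and mu = 100 mas/yr
print(separator_shift(180.0, 110.0))     # about -1.07 mag
```

The negative sign means that the boundary moves to smaller reduced proper motion, that is, upwards in a diagram where reduced proper motion increases downwards, by about 1 mag.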

With the aid of the reduced proper motion separator, we
separate the kinematic sample into candidate halo (1,336
pairs) and disk binaries (10,112 pairs). This fraction of
halo systems is consistent with the above estimate obtained
for the geometric sample. To assess selection effects,
we first investigate the sample of single stars. The
top left panel in Figure 5 shows the fraction of all the
stars shown in Figure 5 that have proper motion larger
than 15 mas yr$^{-1}$ and $r < 19.5$ (the latter limit ensures
the SDSS-POSS proper motion catalog completeness
above ~90%). The selection efficiency is a strong
function of distance, and falls from its maximum of ~95%
for nearby stars to below 50% at a distance of about 1
kpc. The candidate disk stars are detected in significant
numbers to ~3 kpc, and halo stars beyond ~1 kpc. The
fraction of selected stars that are classified as halo stars is
below 20% closer than ~1.5 kpc from the Galactic plane,
and becomes essentially 100% beyond 3 kpc.

The kinematic difference between halo and disk stars 
is blurred at distances beyond 2-3 kpc (see Appendix 
B), and the majority of disk stars at such distances are 
misidentified as halo stars (the metallicity distribution 
implies that disk stars do exist at distances as large as 7 
kpc from the Galactic plane, see Figure 10 in I08a). To 
demonstrate this effect, we use subsamples of candidate 
disk and halo binaries identified using the reduced proper 
motion diagram that have $0.2 < (g-r)_1 < 0.4$. For these
pairs it is possible to estimate photometric metallicity
(I08a) and use it as an independent population classifier.
Figure 7 shows that practically all candidate binaries with
[Fe/H] > -1 further than ~2 kpc from the Galactic plane are
misclassified as halo stars when using the reduced proper
motion diagram.

In summary, the geometric sample is heavily dominated by
disk binaries, with halo contamination all but negligible
closer than about 2 kpc from the plane. The kinematic
sample becomes severely incomplete (<50%) further than
~2 kpc from the plane, and has a higher fraction of halo
binaries than the geometric sample at a given distance from
the plane. However, this halo contamination can be
efficiently removed using the reduced proper motion
diagram. Unfortunately, the number of selected halo
binaries is insufficient (1,336 in the kinematic and 5,556
in the geometric sample), and they span too narrow a color
range to robustly constrain the photometric parallax
relation. Therefore, both samples of candidate binaries are
expected to yield similar photometric parallax relations,
because both are dominated by disk stars.

3. THE PHOTOMETRIC PARALLAX ESTIMATION 
METHOD 

In principle, both the normalization and the shape of the
photometric parallax relation (i.e., the shape of the main
sequence in the Hertzsprung-Russell diagram) vary as a
function of color and metallicity (Laird, Carney & Latham
1988; Siegel et al. 2002). Since our data do not allow a
reliable estimate of metallicity over the entire range of
observed colors, we can only estimate the "mean" shape of
the photometric parallax relation as a function of color,
for all metallicities present in the sample. Such a mean
shape is approximately an average of individual
[Fe/H]-dependent relations, weighted by the sample
metallicity distribution. J08 derived such "mean"
photometric parallax relations appropriate at the red end
for the nearby, metal-rich stars, and at the blue end for
distant, metal-poor stars. I08a discuss the offset of the
photometric parallax relation as a function of metallicity
(see their Figure 20), and derive the metallicity range
implied by the "mean" photometric parallax relations from
J08. The derived metallicity range is consistent with the
spatial distribution of metallicity derived by I08a and the
color-magnitude limits of the SDSS survey.

3.1. The Photometric Parallax Parametrization 

We adopt the J08 polynomial $r-i$ parametrization of
the photometric parallax relation

$$M_r(r-i\,|\,\mathbf{p}) = A + B\,(r-i) + C\,(r-i)^2 + D\,(r-i)^3 + E\,(r-i)^4, \qquad (9)$$

where $\mathbf{p} = (A, B, C, D, E)$ are the parameters we wish to
estimate. To improve their accuracy, Juric et al. used a
maximum likelihood technique to estimate the $r-i$ color
from the observed $g-r$ and $r-i$ colors. Because of the
brighter flux limit employed here, we use the measured
$g-i$ color to derive a best estimate of the $r-i$ color via
a stellar locus relation (J08):

$$g-i = 1.39\,(1-\exp[-4.9\,(r-i)^3 - 2.45\,(r-i)^2 - 1.68\,(r-i) - 0.\ldots \qquad (10)$$

The $r-i$ color estimate obtained with this method has
several times smaller noise than the measured $r-i$ color.
This is because the observed dynamic range of the $g-i$
color is much larger than that of the $r-i$ color (~3 mag
vs. ~1 mag), while their measurement errors are similar.
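As a minimal sketch of how Equation 9 can be evaluated, the snippet below computes $M_r$ for a given $r-i$ color and coefficient vector. The function name `abs_mag_r` is illustrative only, and the example coefficients are the J08 "bright" relation as quoted in this paper (Equation 15).

```python
import numpy as np

# Coefficients p = (A, B, C, D, E) of the J08 "bright" relation,
# as quoted in this paper (Equation 15).
P_BRIGHT = (3.2, 13.30, -11.50, 5.40, -0.70)


def abs_mag_r(ri, p):
    """Evaluate M_r(r-i | p) = A + B x + C x^2 + D x^3 + E x^4 (Equation 9)."""
    ri = np.asarray(ri, dtype=float)
    # np.polyval expects the highest-order coefficient first, so reverse p.
    return np.polyval(p[::-1], ri)
```

At $r-i = 1.0$ the polynomial reduces to the sum of the five coefficients, so `abs_mag_r(1.0, P_BRIGHT)` gives $M_r = 9.7$.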

3.2. The Parameter Estimation Algorithm 



The goal of the parameter estimation algorithm is to
determine the photometric parallax relation,
$M_r(r-i\,|\,\mathbf{p})$, that minimizes the width of the
distribution of $\delta$ values for true binary systems,
where $\delta = (M_{r2} - M_{r1}) - (r_2 - r_1)$.
A $\chi^2$ minimization cannot be used for this purpose
because random pairs, if not removed from the sample,
will strongly bias the best-fit $M_r$. The available color,
angular separation, and proper motion information are
insufficient to separate the random pairs from the true
binaries. Therefore, we need to design a fitting algorithm
that is affected as little as possible by random pairs.

We begin by studying the behavior of $\delta$ values in mock
catalogs. The first step in creating a mock catalog is the
selection of 51,753 (random) pairs from the random sample.
Note that the fraction of true binaries in the random
sample is only ~0.25% (see Section 4.1). True binaries
are then "created" in the mock catalog by replacing the
observed $r_2$ magnitudes for 20% of pairs with

$$r_2 = r_1 + (M_{r2} - M_{r1}) + N(0, 0.1), \qquad (11)$$

where $M_r = M_r(r-i\,|\,\mathbf{p}_0)$ and
$\mathbf{p}_0 = (3.2, 13.30, -11.50, 5.40, -0.70)$ (Equation 2
coefficients from J08). $N(0, 0.1)$ is a Gaussian random
variate added to account for the intrinsic scatter around
the photometric parallax relation. The result of this
process is a mock sample of candidates where 20% of
pairs are "true" binaries, and the rest (80%) is
contamination made of random pairs. The distribution
of $\delta$ values for "true" binaries is, by construction,
a 0.1 mag wide Gaussian centered on zero when
$M_r = M_r(r-i\,|\,\mathbf{p}_0)$.
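The mock-catalog construction of Equation 11 can be sketched as follows. This is an illustration under stated assumptions, not the code used for the paper: the function name `make_mock`, the random selection of which pairs become "true" binaries, and the reading of $N(0, 0.1)$ as a Gaussian with a standard deviation of 0.1 mag are all assumptions.

```python
import numpy as np

# J08 "bright" coefficients p0 = (A, B, C, D, E), as quoted in the text.
P0 = (3.2, 13.30, -11.50, 5.40, -0.70)


def make_mock(ri1, ri2, r1, r2, frac_true=0.2, sigma=0.1, p=P0, seed=0):
    """Turn a fraction of random pairs into mock 'true' binaries (Equation 11).

    Returns the modified r2 magnitudes and a boolean mask of the mock binaries.
    """
    rng = np.random.default_rng(seed)
    ri1, ri2, r1 = (np.asarray(a, dtype=float) for a in (ri1, ri2, r1))
    r2 = np.array(r2, dtype=float)  # copy, so the input is left untouched
    n = r2.size

    # Assumption: the pairs converted to "true" binaries are chosen at random.
    is_true = np.zeros(n, dtype=bool)
    n_true = int(round(frac_true * n))
    is_true[rng.choice(n, size=n_true, replace=False)] = True

    # Absolute magnitudes of both components from the adopted relation.
    m1 = np.polyval(p[::-1], ri1)
    m2 = np.polyval(p[::-1], ri2)

    # Equation 11: r2 = r1 + (M_r2 - M_r1) + N(0, sigma).
    noise = rng.normal(0.0, sigma, n_true)
    r2[is_true] = (r1 + (m2 - m1))[is_true] + noise
    return r2, is_true
```

With this construction, $\delta$ evaluated with `P0` for the flagged pairs is just the injected Gaussian noise, while the remaining pairs keep their original (random) magnitudes.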

Figure 8 (top) shows the distribution of $\delta$ values for
the mock sample evaluated with the "true"
[$M_r(r-i\,|\,\mathbf{p}_0)$] photometric parallax relation.
The observed $\delta$ distribution can be described as a sum
of a Gaussian and a non-Gaussian component. The
non-Gaussian component is due to random pairs (the
contamination), while the Gaussian component (0.1 mag
wide and centered on zero) is due to the true binaries.

When an $M_r$ relation different from the "true" (or
best-fit) $M_r$ is adopted, the Gaussian component becomes
wider and the peak height of the $\delta$ distribution
decreases, as shown in Figure 8 (bottom). At the same time,
the peak height of the $\delta$ distribution of the
contamination changes much less since the distribution is
much wider (~2.3 mag wide). Therefore, minimizing the
width of the $\delta$ distribution of true binaries is
equivalent to maximizing the peak height of the entire
$\delta$ distribution. We quantify this peak height as the
number of candidate binaries in the most populous $\delta$
bin.
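The goodness-of-fit statistic described above can be sketched in a few lines. The function names and the histogram range are assumptions made for illustration; the 0.1 mag bin width is the value used in Section 3.3.

```python
import numpy as np


def delta(ri1, ri2, r1, r2, p):
    """delta = (M_r2 - M_r1) - (r2 - r1) for each candidate pair."""
    m1 = np.polyval(p[::-1], ri1)
    m2 = np.polyval(p[::-1], ri2)
    return (m2 - m1) - (np.asarray(r2, dtype=float) - np.asarray(r1, dtype=float))


def peak_height(d, bin_width=0.1, d_range=(-5.0, 5.0)):
    """Number of pairs in the most populous delta bin.

    The histogram range is an assumed value; pairs outside it are ignored.
    """
    nbins = int(round((d_range[1] - d_range[0]) / bin_width))
    counts, _ = np.histogram(d, bins=nbins, range=d_range)
    return int(counts.max())
```

A relation close to the true one concentrates the true binaries into a few bins near zero, so `peak_height(delta(...))` is largest for the best-fit coefficients.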

3.3. The Algorithm Implementation 

To robustly explore the parameter space that defines
the photometric parallax relation, and to find the best-fit
coefficients $\mathbf{p}$, we implement our algorithm as a
Markov chain Monte Carlo (MCMC) process. The MCMC
description given here and our implementation of the
algorithm are based on examples given by Tegmark et al.
(2004), Ford (2005), and Croll (2006).

The basic idea of the MCMC approach is to take an
n-step intelligent random walk around the parameter space
while recording the point in parameter space for each
step. Each successive step is allowed to be some small
distance in parameter space from the previous position.
A step is always accepted if it improves the fit, and is
sometimes accepted on a random basis even if the fit is
worse, where the goodness of the fit is quantified by some
parameter (usually $\chi^2$). The random acceptance
of a bad fit ensures that the MCMC does not become
stuck in a local minimum, and allows the MCMC to fully
explore the surrounding parameter space.

We start a Markov chain by setting all coefficients from
Equation 9 to zero ($\mathbf{p}_i = 0$). Using this
initial set of coefficients we evaluate
$\delta = (M_{r2} - M_{r1}) - (r_2 - r_1)$ for all candidate
binaries assuming $M_r(r-i\,|\,\mathbf{p}_i)$, and bin the
$\delta$ values in 0.1 mag wide bins. The number of
candidate binaries in the most populous bin, $P_i$, is used
to quantify the relative goodness of the fit.

Given $\mathbf{p}_i$, a new candidate step,
$\mathbf{p}_n = \mathbf{p}_i + \Delta\mathbf{p}$, is
generated, where the step size, $\Delta\mathbf{p}$, is a
vector of independent Gaussian random variates with initial
widths, $\sigma$, set to 1. Using the candidate set of
coefficients, $\mathbf{p}_n$, $\delta$ values are evaluated,
binned, and the height of the $\delta$ distribution is
assigned to parameter $P_n$.

Following the Metropolis-Hastings rule (Metropolis et al.
1953; Hastings 1970), the candidate step is accepted
($\mathbf{p}_{i+1} = \mathbf{p}_n$, $P_{i+1} = P_n$) if
$P_n > P_i$ or if $\exp(P_n - P_i) > \xi$, where $\xi$ is a
random number between 0 and 1 ($\xi \in [0,1]$). Otherwise,
the candidate step is rejected.
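The acceptance rule above translates directly into code. The sketch below is illustrative; the function name `accept_step` and the use of Python's standard `random` module for $\xi$ are assumptions.

```python
import math
import random


def accept_step(p_new, p_old, rng=random):
    """Acceptance rule as stated in the text.

    p_new and p_old are the peak heights P_n and P_i of the delta
    distribution for the candidate and the current coefficients.
    """
    if p_new > p_old:
        return True  # a better fit is always accepted
    # A worse (or equal) fit is accepted when exp(P_n - P_i) > xi,
    # with xi drawn uniformly from [0, 1).
    return math.exp(p_new - p_old) > rng.random()
```

Because the peak heights are integer counts, a candidate that lowers the peak by one pair is accepted with probability $e^{-1} \approx 0.37$, and one that lowers it by many pairs is almost never accepted.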

While the Metropolis-Hastings rule guarantees that the
chain will converge, it does not specify when the
convergence is achieved. The speed of the convergence
depends on the Gaussian scatter $\sigma$ used to calculate
the step size $\Delta\mathbf{p}$. If the scatter is too
large, a large fraction of candidate steps is rejected,
causing the chain to converge very slowly. If the scatter
is too small, the chain behaves like a random walk, and the
number of steps required to traverse some short distance in
the parameter space scales as $1/\sigma^2$. The choice of
the optimal Gaussian scatter $\sigma$ (for each fitted
coefficient), as a function of the position in the
parameter space, is not trivial and it can be very
complicated even if the fitted coefficients are
uncorrelated.

To determine the optimal $\sigma$ values we follow the
Tegmark et al. (2004) prescription (see their Appendix
A). After every 100 accepted steps we compute the
coefficient covariance matrix
$C = \langle \mathbf{p}\mathbf{p}^t \rangle - \langle \mathbf{p} \rangle \langle \mathbf{p}^t \rangle$
from the chain itself, diagonalize it as
$C = R \Lambda R^t$, and use it to calculate a new step size
$\Delta\mathbf{p}' = R^t \Lambda^{1/2} \Delta\mathbf{p}$ for
each coefficient separately. We find that this
transformation greatly accelerates the convergence of a
chain.
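A rough sketch of such a covariance-based step rescaling is given below. It is not the authors' implementation: the function names are hypothetical, and the matrix convention is that of `numpy.linalg.eigh`, where the eigenvectors are the columns of $R$, so the correlated step is formed as $R \Lambda^{1/2} \mathbf{z}$ with $\mathbf{z}$ a vector of unit Gaussian variates. Whether $R$ or $R^t$ appears depends on how the eigenvector matrix is defined.

```python
import numpy as np


def step_transform(chain):
    """Matrix that maps unit Gaussian steps onto the chain's covariance.

    chain: array of shape (n_steps, n_coefficients) of accepted points.
    """
    cov = np.cov(np.asarray(chain, dtype=float), rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)   # cov = R diag(eigval) R^T
    eigval = np.clip(eigval, 0.0, None)    # guard against round-off negatives
    return eigvec * np.sqrt(eigval)        # columns scaled: R Lambda^(1/2)


def propose(p_current, transform, rng):
    """Candidate step p_n = p_i + R Lambda^(1/2) z, with z ~ N(0, 1)."""
    z = rng.standard_normal(transform.shape[0])
    return np.asarray(p_current, dtype=float) + transform @ z
```

Steps drawn this way have the same covariance as the chain itself, so the proposal is elongated along directions in which the coefficients are strongly correlated.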

Due to the stochastic nature of the MCMC, the best-fit
relations (coefficients with the highest $P_i$ value in a
chain) from different chains will not necessarily be the
same. To quantify the intrinsic scatter between different
best-fit relations, we run fifty 10,000-element long chains,
and select the best-fit coefficients from each chain for
subsequent comparison (see Section 3.4). The proper
mixing and convergence of chains is confirmed using the
Gelman & Rubin R statistic (Gelman & Rubin 1992).
Gelman & Rubin suggest running the chains until R < 1.2
for all fitted coefficients. With 10,000 elements in
each chain, we obtain R < 1.01 for all fitted coefficients.
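For reference, one common form of the Gelman & Rubin statistic for a single coefficient is sketched below. The exact variant used for the paper is not specified here, so this should be read as an assumed, textbook-style estimator with a hypothetical function name.

```python
import numpy as np


def gelman_rubin(chains):
    """Gelman & Rubin R statistic for one fitted coefficient.

    chains: array of shape (m, n), i.e. m chains with n samples each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return float(np.sqrt(var_hat / W))
```

Well-mixed chains sampling the same distribution give R close to 1, while chains stuck around different values give R well above the 1.2 threshold.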

In the end, we select $\mathbf{p} = (A, B, C, D, E)$ with the
highest $P_i$ value among all chains as our best-fit
relation. The constant term $A$ is not constrained by our
algorithm, because $A$ (from $M_{r2}$ and $M_{r1}$) cancels
out when evaluating $\delta$. Instead, we constrain $A$ by
requiring $M_r = 10.07$ at $r-i = 1.1$, obtained from
trigonometric parallaxes of nearby M dwarfs (West,
Walkowicz & Hawley 2005).
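The normalization step amounts to solving Equation 9 for $A$ at the reference color. A minimal sketch (the function name is illustrative):

```python
def normalize_A(B, C, D, E, ri_ref=1.1, Mr_ref=10.07):
    """Constant term A such that M_r(ri_ref) = Mr_ref (Equation 9)."""
    x = ri_ref
    return Mr_ref - (B * x + C * x**2 + D * x**3 + E * x**4)
```

Applied to the higher-order coefficients of the two best-fit relations given in Section 3.5, this returns $A \approx 3.32$ and $A \approx 3.42$, matching the constant terms of Equations 12 and 13.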

3.4. Algorithm Robustness Test 

To test the robustness of our algorithm, we apply it to
the mock sample described in Section 3.2. The best-fit
relations (obtained from Markov chains) are compared
on a $0.1 \le r-i \le 1.5$ grid in 0.01 mag steps. We find
an rms scatter of 0.05 mag between Markov chains, and
0.05 mag rms scatter between the true and the best-fit
relation with the highest $P_i$ value.

We repeat this test with a mock sample containing 30% 
of true binaries. The rms scatter between the best-fit 
relations decreases to 0.03 mag, and the rms scatter be- 
tween the true and the best-fit relation with the highest 
Pi value decreases to 0.01 mag. 

Even when only 20% of sources are true binaries (i.e., 
contamination by random pairs is 80%) our algorithm 
recovers the "true" photometric parallax relation at the 
0.05 mag (rms) level. The accuracy of the fit increases 
(to 0.01 mag rms) as the contamination decreases (from 
80% to 70%). 

3.5. Best-fit Photometric Parallax Relations 

We apply the method described in Section 3.3 to the two
samples of candidate binaries and obtain the best-fit
photometric parallax relations

$$M_r = 3.32 + 15.02\,(r-i) - 18.58\,(r-i)^2 + 13.28\,(r-i)^3 - 3.39\,(r-i)^4 \qquad (12)$$

$$M_r = 3.42 + 13.75\,(r-i) - 15.50\,(r-i)^2 + 10.40\,(r-i)^3 - 2.43\,(r-i)^4 \qquad (13)$$

for the geometrically- and kinematically-selected samples,
respectively. Candidate halo binaries were removed
from the kinematically-selected sample using reduced
proper motion diagrams (Section 2.6) before Equation 13
was derived. The photometric parallax relations for halo
stars cannot be robustly constrained using geometrically-
or kinematically-selected halo binaries because the color
range they span is too narrow ($g-i < 1.0$ at 3-4 kpc, see
Figures 4 and 6).

We test the correctness of the shape by studying the
dependence of median $\delta$ values on the $g-i$ colors of
the brighter and the fainter components. If the shape of
these photometric parallax relations is correct, the
distribution of $\delta$ values will be centered on zero,
and the individual $\delta$ values will not correlate with
color. The medians are used because they are more robust to
outliers (random pairs in the sample). We start by
calculating $\delta$ values for each candidate binary sample
(using the appropriate $M_r$ relation), and then select
candidates with $|\delta| < 0.4$. This cut reduces the
contamination by random pairs, as demonstrated in
Section 3.6. The selected candidate binaries are binned in
$g-i$ colors of the brighter and the fainter component, and
the median $\delta$ values are shown in Figure 9.

The distributions of the median $\delta$ for each pixel are
fairly narrow (0.07 mag), and centered on zero.
Irrespective of color and the choice of the two best-fit
photometric parallax relations, the deviations are confined
to the 0.25 mag range, placing an upper limit on the errors
in the mean shape of the adopted relations.



In Figure 10 we compare the adopted photometric parallax
relations to the J08 "faint"

$$M_r = 4.0 + 11.86\,(r-i) - 10.74\,(r-i)^2 + 5.99\,(r-i)^3 - 1.20\,(r-i)^4 \qquad (14)$$

and "bright"

$$M_r = 3.2 + 13.30\,(r-i) - 11.50\,(r-i)^2 + 5.40\,(r-i)^3 - 0.70\,(r-i)^4 \qquad (15)$$

photometric parallax relations. The rms difference between
Equations 12 and 13 and Equation 15 is ~0.13 mag,
comparable to the rms difference between our Equations 12
and 13 (~0.13 mag). The maximum difference between
Equations 12 and 13 and Equation 15 is ~0.25 mag, again
comparable to the maximum difference between our
Equations 12 and 13 (~0.25 mag). The different color
distributions of the two samples, shown in Figure 11,
together with metallicity effects, are the most likely
explanation for differences between the two photometric
parallax relations.
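A comparison of this kind can be sketched as below. The color grid ($0.1 \le r-i \le 1.5$ in 0.01 mag steps) is borrowed from the robustness test in Section 3.4 and is an assumption here; the color range actually used for the quoted rms and maximum differences may differ, so the numbers returned by this sketch need not reproduce them exactly.

```python
import numpy as np

EQ12 = (3.32, 15.02, -18.58, 13.28, -3.39)  # geometrically-selected sample
EQ13 = (3.42, 13.75, -15.50, 10.40, -2.43)  # kinematically-selected sample


def compare(p_a, p_b, lo=0.1, hi=1.5, step=0.01):
    """Return (rms, max |difference|) of two M_r relations on an r-i grid."""
    x = np.arange(lo, hi + step / 2, step)
    diff = np.polyval(p_a[::-1], x) - np.polyval(p_b[::-1], x)
    return float(np.sqrt(np.mean(diff**2))), float(np.max(np.abs(diff)))
```

On this assumed grid the largest difference between Equations 12 and 13 is about 0.27 mag and occurs at the red end ($r-i = 1.5$), in line with the ~0.25 mag quoted above.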

3.6. The Analysis of the Scatter in Predicted Absolute 
Magnitudes 

The scatter in $\delta$ values can be expressed as

$$\langle \delta^2 \rangle = \langle (\Delta M - \Delta m)^2 \rangle \approx \langle \Delta M^2 \rangle + \langle \Delta m^2 \rangle, \qquad (16)$$

where $\langle \Delta M^2 \rangle$ is the scatter in predicted
absolute magnitudes, and $\langle \Delta m^2 \rangle$ is the
scatter in measured apparent magnitudes. Since the
photometric uncertainties of SDSS are well understood, it
is possible to measure and characterize the intrinsic
scatter around the $M_r(r-i)$ relation.

In Figure 12 we plot the observed distributions of $\delta$
values for the geometrically- and kinematically-selected
binaries, and overplot the $\delta$ distribution of the
random sample. The $\delta$ values for the random sample
were calculated with Equations 12 and 13, respectively. The
$\delta$ distribution of the random sample was fitted to the
observed $\delta$ distribution in the $|\delta| > 1$ range
using the Kolmogorov-Smirnov test.

By comparing the random and the observed $\delta$
distributions, we find that the two match well for
$|\delta| > 1$ (the Kolmogorov-Smirnov test reports
P ~ 0.95), indicating that candidate binaries with
$|\delta| > 1$ are almost certainly random pairs. On the
other hand, as $\delta$ approaches zero, the two
distributions become remarkably different
(P ~ $10^{-7}$ for $|\delta| < 1$), indicating that these
candidate binaries are dominated by true binary systems,
and not by random pairs.

The $\delta$ distribution for true binaries (Figure 12,
dashed line), obtained by subtracting the random from the
observed $\delta$ distribution, is clearly not Gaussian. It
can be modeled as a sum of two Gaussian distributions
("narrow" and "wide") centered close to zero, and about 0.1
mag and 0.55 mag wide. The centers, widths, and areas for
the best-fit Gaussian distributions are given in Table 1.

To determine the consistency of the observed scatter
with photometric errors, we normalize the $\delta$ values
for the kinematically-selected sample with expected formal
errors,

$$\sigma_\delta = \left(\sigma_{M_{r2}}^2 + \sigma_{M_{r1}}^2 + \sigma_{r_2}^2 + \sigma_{r_1}^2\right)^{1/2}, \qquad (17)$$

and plot the $\delta/\sigma_\delta$ distribution in
Figure 13. The $\delta/\sigma_\delta$ distribution for true
binaries is not a Gaussian with a width of 1, as we would
expect if the scatter in the $\delta$ distribution was only
due to photometric errors in the gri bands (note that the
expected random error in $M_r$ is about 5-10 times larger
than the random error of the $g-i$ color because
$dM_r/d(g-i)$ varies from ~10 at the blue edge to ~5 at the
red edge).

The width of the $\delta/\sigma_\delta$ distribution for the
geometrically-selected candidate binaries is about 3 times
smaller than in the kinematically-selected sample. We find
that this is due to overestimated photometric errors in the
geometrically-selected sample, as shown in Figure 14.
The overestimated photometric errors in the gri bands
inflate the expected formal error $\sigma_\delta$, and the
overall $\delta/\sigma_\delta$ distribution is too narrow.
We speculate that the small angular separation (~3")
between the components is the cause of overestimated
photometric errors (perhaps due to sky background
estimates). The small angular separation of components does
not affect the magnitudes of stars in the
geometrically-selected sample. If it did, the two $\delta$
distributions would be significantly different which, as
shown in Figure 12, is not the case.

The observed non-Gaussian scatter in predicted absolute
magnitudes may be due to photometric parallax variation as
a function of metallicity. As noted at the beginning of
Section 3, we can only estimate the "mean" shape of the
photometric parallax relation. Since the intrinsic
photometric parallax relation for a given wide binary
system is different from the mean relation, $\Delta M$ (the
difference of predicted absolute magnitudes) and $\Delta m$
(the measured difference of apparent magnitudes) will
differ. This discrepancy will increase for systems where
the components have significantly different colors.

To test the assumption that the shape of the photometric
parallax relation increases the scatter in predicted
absolute magnitudes, we use the mock sample constructed in
Section 3.2 and add a color-dependent offset to apparent
magnitudes

$$r'_1 = r_1 + \xi\,(g-i)_1 \qquad (18)$$

$$r'_2 = r_2 + \xi\,(g-i)_2 \qquad (19)$$

where $\xi$ is a random number between zero and one (the
same for both components). These color-dependent offsets
simulate the change in the shape of the photometric
parallax relation due to metallicity. We apply the
algorithm described in Section 3.3 to the mock sample, and
obtain a revised photometric parallax relation. Using
this relation, we analyze the distribution of $\delta$
values and find that it can be modeled as a sum of two
Gaussians centered on zero, with widths of 0.1 and 0.3 mag.
This result suggests that the non-Gaussian scatter observed
in candidate samples may be caused by the difference
between the shapes of the mean photometric parallax
relation and a true relation for a given metallicity (and
perhaps other effects, such as age).

This model-based conclusion is consistent with a direct
comparison of relations derived here and the relations
from I08a evaluated for the median halo metallicity
([Fe/H] = -1.5) and the median disk metallicity
([Fe/H] = -0.7 for distances probed by our sample; see
Figure 5 in the above paper). The two relations
corresponding to halo and disk stars are offset by 0.6 mag
due to the metallicity difference. Our relations match the
low-metallicity relation at the blue end and the
high-metallicity relation at the red end. Therefore, in the
worst case scenario of extremely blue ($r-i = 0.3$) and
red ($r-i = 1.4$) disk stars, the maximum error in the
difference of their absolute magnitudes is 0.6 mag. When
convolved with the observed color distribution of pairs,
the expected scatter is about 0.2-0.3 mag, consistent with
the observed and simulated widths of the $\delta$
distributions.

Unresolved binarity of components in candidate samples
may also contribute to the non-Gaussian scatter in
predicted absolute magnitudes. The multiplicity studies
of G dwarfs (Duquennoy & Mayor 1991) and M dwarfs
(Fischer & Marcy 1992) find that a significant fraction
of G and M dwarf stars (40-60%) are unresolved binary
systems. If a component of a wide binary system is an
unresolved binary system, its luminosity will be
underestimated (with the magnitude of the offset depending
on the actual composition of the binary) and the $\delta$
value for the wide binary system will systematically
deviate from zero. In Appendix C we model the presence of
unresolved binaries in wide binary systems, and find that
the model can explain the observed $\delta$ scatter.

Therefore, both the intrinsic variations of the photometric
parallax relation and unresolved binaries can explain the
observed non-Gaussian scatter of $\delta$. The data
discussed here are insufficient to disentangle these two
effects.

Finally, the uncertainty in predicted absolute magnitudes
(the error distribution for the photometric parallax
method) can be obtained by drawing random values, $x$,
from a non-Gaussian distribution

$$f(x) = A_1\,N(x\,|\,\mu_1, \sigma_1/\sqrt{2}) + A_2\,N(x\,|\,\mu_2, \sigma_2/\sqrt{2}), \qquad (20)$$

where $N(x\,|\,\mu, \sigma)$ are Gaussian distributions, and
the best-fit parameters are listed in Table 1.
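Drawing from the two-Gaussian distribution of Equation 20 can be sketched as follows. The function name is hypothetical, and the parameters must be taken from Table 1; the values in the usage note below are placeholders, not the tabulated ones.

```python
import numpy as np


def sample_mr_errors(n, areas, means, widths, rng):
    """Draw n values from Equation 20.

    areas, means, widths are (A_k, mu_k, sigma_k) of the fitted Gaussians;
    each component is sampled with width sigma_k / sqrt(2).
    """
    areas = np.asarray(areas, dtype=float)
    # Pick a component for every draw, with probability proportional to its area.
    which = rng.choice(len(areas), size=n, p=areas / areas.sum())
    mu = np.asarray(means, dtype=float)[which]
    sigma = np.asarray(widths, dtype=float)[which] / np.sqrt(2.0)
    return rng.normal(mu, sigma)
```

For example, `sample_mr_errors(1000, [0.5, 0.5], [0.0, 0.0], [0.1, 0.55], np.random.default_rng(0))` uses the approximate widths quoted in the text with assumed equal areas and zero means.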

4. THE PROPERTIES OF WIDE BINARIES 

The best-fit photometric parallax relation can be uti- 
lized to further refine the samples of candidate bina- 
ries and to address questions about their dynamical and 
physical properties such as 

• Do wide binaries have the same spatial distribution 
as single stars? 

• Do wide binaries have the same color distribution 
as single stars? 

• Are the color distributions of components in wide 
binary systems consistent with random pairings? 

• What is the distribution of semi-major axis for wide 
binaries? 

• Does the distribution of semi-major axis vary with 
the position in the Galaxy? 

4.1. High-Efficiency Samples of Candidate Binaries

We use the best-fit photometric parallax relations to
select samples of candidate binaries with high selection
efficiency (a high fraction of true binaries) by imposing
further constraints on $\delta$ values in the geometric and
kinematic samples.

As shown in Figure 12, the fraction of random pairs in
the candidate sample is simply
$A_{\rm random}/A_{\rm observed}$, where $A_{\rm random}$ and
$A_{\rm observed}$ are the integrals of the random
(triangles) and total (thick solid line) $\delta$
distributions. The fraction of true binaries, or the
selection efficiency, is then

$$\epsilon = 1 - \frac{A_{\rm random}}{A_{\rm observed}}. \qquad (21)$$

Without a cut on $\delta$, the fraction of true binaries
(the selection efficiency) in the geometrically- and
kinematically-selected samples is 34% and 35%,
respectively. It is reassuring to find that the $\epsilon$
value for the geometrically-selected sample obtained here,
and the one measured in Section 2.3, match so well (at a 1%
level), even though the two methods for estimating
$\epsilon$ are independent.

The selection efficiency of 35% for the
kinematically-selected sample may seem low, given that only
0.5% of random pairs pass the common proper motion
criteria. This points to a low fraction of true binaries
with angular separation greater than 15". If this fraction
is about 1/400 (0.25%), the common proper motion criteria
will select 2 random pairs (0.5% out of 400), and only 1
true binary system. Therefore, 66% of the sample (2
out of 3) will be random pairs, and 34% (1 out of 3)
will be true binary systems, similar to what we find for
the kinematically-selected sample. The result that only
1/400 pairs with $\theta > 15"$ are true binaries puts the
fraction of random pairs in the random sample at 99.75%.
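The arithmetic in this argument can be checked with a short sketch. The function name is illustrative, and the assumption that every true binary passes the common proper motion criteria (`true_pass_rate=1.0`) is a simplification made here.

```python
def selection_efficiency(true_fraction, random_pass_rate, true_pass_rate=1.0):
    """Fraction of selected pairs that are true binaries.

    true_fraction:    fraction of all pairs that are true binaries
    random_pass_rate: fraction of random pairs that pass the selection
    true_pass_rate:   fraction of true binaries that pass (assumed 1 here)
    """
    n_true = true_fraction * true_pass_rate
    n_random = (1.0 - true_fraction) * random_pass_rate
    return n_true / (n_true + n_random)
```

With `true_fraction = 1/400` and `random_pass_rate = 0.005` this gives an efficiency of about one third, i.e. roughly 1 true binary for every 2 random pairs.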

Figure 12 shows that the true binaries have a much
smaller range of $\delta$ values than the random pairs.
Therefore, a cut on $\delta$ would reduce the contamination,
and increase the fraction of true binaries in a sample. By
requiring $|\delta| < 0.4$, we construct samples where 63%
and 64% of candidates are true binaries. The numbers of
candidate binaries in these cleaner samples are 16,575
(geometrically-selected) and 5,157 candidates
(kinematically-selected), with the expected total number
of true binaries about 13,743. The sample efficiency for
the geometric sample can be further increased to 90% by
requiring $|\delta| < 0.2$ and $Z < 0.3$ kpc, where $Z$ is
the height above the Galactic plane. Compared to the
existing catalogs of wide binaries by Chaname & Gould
(2004) and Lepine & Bongiorno (2007), our samples represent
a 20-fold increase in the number of candidate binaries and
probe much deeper into the halo (to ~4 kpc). Although
a non-negligible fraction of candidate pairs are due to
random pairings (~35%), the increase in the number of
potential physical pairs is substantial.
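As a quick consistency check, the expected total number of true binaries quoted above follows from the sample sizes and efficiencies of the two $|\delta| < 0.4$ samples:

```latex
N_{\rm true} \approx 0.63 \times 16{,}575 + 0.64 \times 5{,}157
             \approx 10{,}442.3 + 3{,}300.5
             \approx 13{,}743
```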

We emphasize that our method only selects candidates
where both components are main-sequence stars, while
rejecting systems where one of the components has
evolved off the main sequence. This is due to the
photometric parallax relation, as defined here, being
correct for main-sequence stars only. Together with the
small expected fraction of giant stars in our sample due to
faint apparent magnitudes (1-2%; Finlator et al. 2000;
I08a), this bias results in a practically pure
main-sequence sample. We note that the application of a
photometric parallax relation that corresponds to some mean
metallicity distribution introduces systematic errors in
estimated $M_r$. We partially mitigate this problem by
averaging distances determined for each binary component
(using Equation 12). Based on the behavior of photometric
parallax relations and the $\delta$ distribution discussed
in Section 3.6, the systematic uncertainty in obtained
distances is most likely not larger than 10-15% (an
underestimate due to faint bias for blue stars). Another
source of overall systematic uncertainty in distances is
the normalization of Equation 12 adopted from West,
Walkowicz & Hawley (2005). This normalization corresponds
to nearby (<100 pc) metal-rich stars, while most stars in
our sample are at distances of the order of 1 kpc. The disk
metallicity gradient discussed by I08a implies a systematic
distance overestimate of about 10-20%, partially cancelling
the above underestimate. These systematic uncertainties
propagate as systematic uncertainties of derived semi-major
axes discussed in Section 4.5.

4.2. The Color Distribution of Wide Binaries 

The luminosity of a main-sequence star, and thus its
color via the photometric parallax relation, can be used as
a proxy for stellar mass. The color-color distribution of
wide binaries, therefore, provides constraints on the
distribution of stellar masses in wide binary systems. To
find the color distribution of wide binaries, we select
a volume-complete ($0.7 < d/{\rm kpc} < 1.0$) subsample of
geometrically-selected candidate binaries with
$|\delta| < 0.4$, and plot their distribution in the
$(g-i)_2$ vs. $(g-i)_1$ color-color diagram in Figure 15
(top). The sample is complete in the $0.4 < g-i < 2.8$
color and 4,200 AU < $a$ < 10,000 AU semi-major axis range
(see Section 4.3). Even though the $|\delta| < 0.4$ cut
increases the fraction of true binaries, about 14% of
candidates (in the $0.7 < d/{\rm kpc} < 1.0$ range) are
still random pairs that contaminate the map. To remove the
contamination, we first select pairs from the random sample
(see the end of Section 2.2) with $|\delta| < 0.4$ and
$0.7 < d/{\rm kpc} < 1.0$. The $|\delta| < 0.4$ cut on the
random sample will not increase the fraction of true
binaries ($\epsilon$) above ~1% because $\epsilon$
decreases rapidly with
9 (see Figure HH (middle left) in Section PO]) , and the 
pairs in the random sample have $\theta > 20"$. The
$(g-i)_2$ vs. $(g-i)_1$ distribution of this random sample
is shown in Figure 15 (middle). The maps are essentially
probability density maps as pixels sum to 1. To correct for
the contamination in the top map, we multiply each pixel
in the random map by 0.14 (that being the contamination in
the candidate binary sample), and subtract the two maps.
The corrected map, presented in Figure 15 (bottom), shows
that the color-color distribution of true binary systems is
fairly uniform, has a local maximum around
$(g-i)_{1,2} \sim 2.5$, and reflects the underlying
luminosity function which peaks for red stars (cf.
Figure 4).
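The map correction described above can be sketched as a scaled subtraction of two normalized 2D histograms. The function name and the explicit renormalization of the inputs are assumptions made for this illustration.

```python
import numpy as np


def subtract_contamination(candidate_map, random_map, contamination=0.14):
    """Remove the random-pair contribution from a color-color map.

    Both maps are normalized so that their pixels sum to 1, the random map
    is scaled by the contamination fraction, and the two are subtracted.
    """
    cand = np.asarray(candidate_map, dtype=float)
    rand = np.asarray(random_map, dtype=float)
    cand = cand / cand.sum()
    rand = rand / rand.sum()
    return cand - contamination * rand
```

The corrected map sums to 1 minus the contamination (0.86 here); dividing by that value would turn it back into a unit-sum probability density map.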

The map shown in Figure 15 (bottom) describes the
probability density, $P[(g-i)_1, (g-i)_2]$, of a wide binary
system with components that have $(g-i)_1$ and $(g-i)_2$
colors falling into a given pixel. This probability density
can be expressed as a product

$$P[(g-i)_1, (g-i)_2] = P[(g-i)_B\,|\,(g-i)_A]\;P[(g-i)_A], \qquad (22)$$

where $P[(g-i)_B\,|\,(g-i)_A]$ is the conditional
probability density of having one component with $(g-i)_B$
color in a wide binary system where the other component has
$(g-i)_A$, and $P[(g-i)_A]$ is the probability density that
a star with $g-i = (g-i)_A$ color is in a wide binary
system. These probability densities may also vary
with Galactic coordinates (e.g., with the height above
the Galactic plane), but we cannot study such effects
directly because the samples are volume-complete only in
the $0.7 < d/{\rm kpc} < 1.0$ range.

The conditional probability density,
$P[(g-i)_B\,|\,(g-i)_A]$, can be extracted from the
Figure 15 (bottom) map by selecting pixels where either
$(g-i)_1 = (g-i)_A$, or $(g-i)_2 = (g-i)_A$. The resulting
$P[(g-i)_B\,|\,(g-i)_A]$ for several values of $(g-i)_A$
are shown in Figure 16. Red stars ($(g-i)_A > 2.0$) are
more likely to be associated with another red star than
with a blue star, while for blue stars the companion color
distribution is flat. The best-fit analytic functions that
describe the observed trends are given in Table 2.

The probability density, $P[(g-i)_A]$, that a star with
$g-i = (g-i)_A$ is in a wide binary system can be derived
by comparing the $g-i$ color distribution of stars in wide
binary systems with the $g-i$ color distribution of all
the stars in the same volume. As shown in Figure 17
(top), the $g-i$ color distribution of stars in the
volume-complete wide binary sample roughly follows the
$g-i$ color distribution of all the stars in the same
volume. The ratio of the two distributions (renormalized to
an area of 1) gives $P[(g-i)_A]$, and is shown in the
bottom panel.

The probability for a star to be in a wide binary system
($P[(g-i)_A]$) is independent of its color. Given this
color, the companions of red components seem to be drawn
randomly from the stellar luminosity function, while blue
components have a larger blue-to-red companion ratio
than expected from the luminosity function. These results
are consistent with recent results by Lepine & Bongiorno
(2007). The overall fraction of stars in wide binary
systems is discussed in the next section.

4.3. The Spatial Distribution of Wide Binaries 

If the semi-major axis distribution function, $f(a)$, is
known, the number of stars in wide binary systems can
be determined by integrating $f(a)$ from some lower cutoff,
$a_1$, to the maximum semi-major axis, $a_2$. The power-law
frequency distribution, $f(a) \propto a^{\beta}$,
$\beta = -1$, is known in the context of wide binaries as
the Opik distribution (OD; Opik 1924). When the semi-major
axis distribution of wide binaries follows the OD, the
frequency distribution of log(a) is a straight line with a
slope of zero (Poveda, Allen & Hernandez-Alcantara 2007).
Alternatively, an equivalent representation of the OD is
the cumulative distribution
$N[<\log(a)] \propto \log(a)$. In this form, the OD
is a straight line with a positive slope. We use the
cumulative representation, instead of the differential one,
because it reduces the counting noise in sparsely populated
bins (though the errors become correlated between bins).

We utilize geometrically-selected candidate binaries
(see Section 4.4 for a discussion of the kinematic sample),
but do not limit the selection to θ < 4", as we
did in Section 2.3. Since a ∝ θ, the removal of the upper
limit on θ allows us to probe an extended range of
semi-major axes. The downside is that random pairs dominate
at large θ, and a careful accounting for contamination
as a function of θ is required before the f(a)
distribution can be constrained. Since we only know the
projected separation of our pairs, we use a statistical
relation to calculate the average semi-major axis, ⟨a⟩,
as ⟨a⟩ = 1.411 θ d, where d is the heliocentric distance
(Couteau 1960). Hereafter, we drop the brackets and
simply denote the average semi-major axis as a.
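
As a quick numerical illustration of the statistical relation above, the following Python sketch converts an angular separation and a heliocentric distance into an average semi-major axis. The function name and units convention are our own assumptions for illustration (angular separation in arcsec, distance in pc, result in AU, using the fact that 1 arcsec at 1 pc subtends 1 AU); it is not code from the original analysis.

```python
def average_semi_major_axis_au(theta_arcsec, distance_pc):
    """Average semi-major axis in AU from the relation <a> = 1.411 * theta * d.

    theta_arcsec : angular separation of the pair in arcsec
    distance_pc  : heliocentric distance in pc
    """
    # 1 arcsec at a distance of 1 pc corresponds to a projected separation of 1 AU
    return 1.411 * theta_arcsec * distance_pc


# Example: a 3" pair at 1 kpc
print(average_semi_major_axis_au(3.0, 1000.0))
```

For a 3" pair at 1 kpc this gives roughly 4200 AU, which is within the range of semi-major axes probed in this section.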

Figure 18 (top left) shows the cumulative distribution
of log(a) for candidate wide binaries with |δ| < 0.2
selected from the 0.7 < Z/kpc < 1.0 range. The cumulative
distribution does not follow a straight line, as predicted
by the OD, but actually increases its slope with
log(a). We assume that this is due to an increasing fraction
of random pairs at high log(a), and proceed to verify
this assumption.

Figure 18 (top right) shows the differential distribution
of angular separation for the selected sample. For
θ > θ_max the observed and random distributions closely
match, demonstrating that random pairs dominate at
high θ (or high log(a)). To calculate how the fraction of
true binaries (or random pairs) changes as a function of
θ, we fit f_rnd(θ) = Cθ to the observed histogram, and
calculate the fraction of true binaries, ε, using the
equation given in Section 2.3. The calculated ε values, as well
as the best-fit second-degree polynomial, ε(θ), are shown
in Figure 18 (middle left). As an independent test, the
selection efficiency was calculated using Equation 21 (i.e.,
from the δ distribution) for three θ-selected subsamples,
and the obtained values agree with ε(θ) at a level of 1%.
The angular separation for which ε falls below ~5% is
defined as θ_max. The fraction of true binaries, ε(θ), also
changes as a function of Z, and is determined separately
for different distance bins.

Since the candidates are restricted in Z (Z_min = 0.7
kpc to Z_max = 1.0 kpc in this example) and θ (3" to
θ_max), to ensure a uniform selection in the Z vs. a space
we define

a_{\rm min} = 3'' \cdot 1.411 \cdot 1000\, Z_{\rm max}    (23)

and

a_{\rm max} = \theta_{\rm max} \cdot 1.411 \cdot 1000\, Z_{\rm min} / \sin(45^\circ)    (24)

as the minimum and maximum probed semi-major axis
(in AU, with Z in kpc and θ in arcsec),
shown as the selection box in Figure 18 (middle right).
The sin(45°) factor accounts for the fact that the
candidates are restricted to high (b > 45°) Galactic
latitudes.
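
The selection box defined by Equations 23 and 24 is easy to evaluate numerically. The sketch below is only an illustration under our own naming (the function `selection_box_au` and its arguments are hypothetical); it uses Z in kpc and θ in arcsec, as in the equations. For the 0.1 < Z/kpc < 0.4 bin with θ_max = 16", it reproduces the upper limit of about 3193 AU quoted later in this section.

```python
import math


def selection_box_au(theta_min, theta_max, z_min_kpc, z_max_kpc, b_min_deg=45.0):
    """Return (a_min, a_max) in AU for a bin restricted in Z and theta.

    a_min follows Equation 23 and a_max follows Equation 24; the sin(b_min)
    factor accounts for the restriction to high Galactic latitudes.
    """
    a_min = theta_min * 1.411 * 1000.0 * z_max_kpc
    a_max = theta_max * 1.411 * 1000.0 * z_min_kpc / math.sin(math.radians(b_min_deg))
    return a_min, a_max


# Geometric sample, 0.1 < Z/kpc < 0.4, theta between 3" and 16"
a_min, a_max = selection_box_au(3.0, 16.0, 0.1, 0.4)
print(round(a_min), round(a_max))
```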

To correct the cumulative distribution of log(a), we
assign a probability ε(θ) to each candidate binary in the
a_min to a_max range, and add the probabilities (instead
of counting candidates) when making the cumulative
log(a) distribution. The corrected cumulative distribution,
shown in Figure 18 (bottom left), follows a straight
line up to the turnover point, a_break. We define a_break as
the average semi-major axis at which the cumulative
distribution deviates from the straight-line fit by more than
1.5%. In addition to a_break, we also measure the slope of
the cumulative distribution where it follows the straight
line. It can be shown that the slope of the cumulative
distribution is equal to the constant of proportionality,
N_0, in the Öpik distribution, f(a) = N_0/a. The number of
binaries can be calculated by integrating f(a) from a_1
to a_2, and we obtain N_bin = N_0 log(a_2/a_1). For the
integration limits we choose a_2 = a_break, where we assume
that systems with semi-major axes greater than a_break
are no longer bound, and a_1 = 100 AU (since a_2 ≫ a_1,
the results are not very sensitive to the choice of a_1).
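
The correction described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the paper: the function names are hypothetical, the efficiency values are assumed to be supplied per candidate, and the logarithm is taken as base 10, following the form of Equation 25.

```python
import math
from itertools import accumulate


def corrected_cumulative(log_a, efficiency):
    """Cumulative log(a) distribution in which each candidate contributes its
    probability of being a true binary instead of a count of 1."""
    pairs = sorted(zip(log_a, efficiency))        # sort candidates by log(a)
    xs = [x for x, _ in pairs]
    ys = list(accumulate(w for _, w in pairs))    # running sum of probabilities
    return xs, ys


def number_of_binaries(n0, a_break_au, a1_au=100.0):
    """N_bin = N_0 * log10(a_2 / a_1), with a_2 = a_break and a_1 = 100 AU."""
    return n0 * math.log10(a_break_au / a1_au)


xs, ys = corrected_cumulative([3.5, 3.0, 4.0], [0.5, 1.0, 0.25])
print(xs, ys)
print(number_of_binaries(100.0, 10000.0))
```

Summing probabilities rather than counting candidates keeps the sample intact while statistically removing the contribution of random pairs.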

The uncertainties in a_break, the shape of f(a) (i.e., the
power-law index β), and the number of binaries (N_bin) are
estimated using Monte Carlo simulations. We find that the
uncertainty in a_break is less than 0.1 dex, and the error on
the power-law index β is < 0.1. The uncertainty in
measuring N_bin is about 10%. The corrected cumulative
log(a) distribution obtained from one of these simulations
is shown in Figure 18 (bottom right). The semi-major
axis distribution of "true" binaries in the simulation
sample is f(a) ∝ a^(-0.8), and is valid between 100
AU and a_break = 10,000 AU. The turnover in the distribution
happens because there are no "true" binaries
above 10,000 AU, only random pairs, similar to what we
observe in real data. This similarity is a strong warning
not to over-interpret the slope of f(a) beyond a_break.

To estimate the dependence of β (the shape of f(a)),
a_break, and N_0 on color, we divide the 0.7 < Z/kpc < 1.0
sample into three color subsamples using the (g - i)_1 = 1.8
and (g - i)_2 = 1.5 lines. We find that f(a) follows the OD
in all three subsamples (β = -1), and that the average
log(a_break/AU) is 3.99, with a 0.07 root-mean-square scatter.
The log(a_break/AU) for the full 0.7 < Z/kpc < 1.0 sample is 4.02.
These results suggest that a_break and the shape of f(a)
are independent of the color of binaries. The N_0 value, and
consequently the number of binaries, will depend on the
sample's color range. For the full 0.7 < Z/kpc < 1.0
sample, the number of binaries is

N_{\rm bin} = (N_0^1 + N_0^2 + N_0^3) \log_{10}(a_2/a_1),    (25)

where N_0^i, i = 1, 2, 3, are the N_0 values measured for each
color subsample. Therefore, the number of binaries calculated
for a distance bin will change as the color range
changes. Assuming that the g - i color distribution of binaries
does not change with Z, we can use the g - i color
distribution for the 0.7 < Z/kpc < 1.0 sample (solid line
in Figure 17 (top)), to correct for color incompleteness.
We also assume that the fraction of binaries outside the
0.4 < g - i < 2.8 color range is small. The corrected number
of binaries is then

N_{\rm bin} = N_0 / A[(g-i)_{\rm min}, (g-i)_{\rm max}] \, \log_{10}(a_2/a_1),    (26)

where A[(g - i)_min, (g - i)_max] is the area underneath the
solid line histogram in Figure 17 (top) between (g - i)_min
and (g - i)_max (the g - i color range for a given distance bin).
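
To make the color-incompleteness correction of Equation 26 concrete, the sketch below applies it to made-up numbers. The function name is hypothetical, and we assume that the color histogram is normalized to unit area, so that A is the fraction of binaries falling in the color range sampled by a given distance bin.

```python
import math


def corrected_number_of_binaries(n0, area_fraction, a_break_au, a1_au=100.0):
    """Equation 26: N_bin = N_0 / A * log10(a_2 / a_1), with a_2 = a_break.

    area_fraction : area A under the unit-normalized g - i histogram between
                    the color limits of the distance bin (0 < A <= 1)
    """
    if not 0.0 < area_fraction <= 1.0:
        raise ValueError("area_fraction must be in the interval (0, 1]")
    return n0 / area_fraction * math.log10(a_break_au / a1_au)


# If only half of the color distribution is sampled, the raw N_0 is doubled
print(corrected_number_of_binaries(50.0, 0.5, 10000.0))
```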

The systematic error in a_break due to the
choice of the |δ| cut is estimated using the |δ| < 0.1 and |δ| <
0.4 samples. We find that a_break changes by < 0.03 dex
between these samples. This result suggests that a_break
is not sensitive to the choice of the |δ| cut. Similarly, the
change in a_break is less than 0.03 dex if the estimate of
ε(θ) is off by ±0.1 (~10% change) from the best-fit ε(θ).

To establish whether the semi-major axis distribution follows
the OD in other Z bins, we repeat the f(a) and
a_break measuring procedure on 8 Z bins, and show the
corrected cumulative distributions with best-fit straight
lines in Figure 19. In general, the corrected cumulative
distributions follow a straight line, and then start to deviate
from it at a_break. In the 0.1 < Z/kpc < 0.4 bin we
do not see a turnover due to the narrow range of probed
projected separations (θ_max = 16" limits the range to
3193 AU, see Figure 20), and only determine the upper
limit on a_break.

As the average height above the Galactic plane increases,
a_break moves to higher values. We investigate
this correlation in more detail in Figure 21 (top left). The
data follow a straight line log(a_break) = k log(Z[pc]) + l,
where k = 0.72 ± 0.05 and l = 1.93 ± 0.15, or approximately,
a_break[AU] = 12,300 Z[kpc]^0.7 in the 0.3 <
Z/kpc < 3.0 range.
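
The best-fit relation is straightforward to evaluate. The following sketch (with a hypothetical function name, and using only the central values of k and l without their uncertainties) recovers a_break of about 12,300 AU at Z = 1 kpc, and about 4534 AU at Z = 250 pc, the value quoted in Section 4.4 for the 0.1 < Z/kpc < 0.4 bin.

```python
import math


def a_break_au(z_pc, k=0.72, l=1.93):
    """a_break in AU from log(a_break) = k * log(Z[pc]) + l."""
    return 10.0 ** (k * math.log10(z_pc) + l)


for z_pc in (250.0, 1000.0, 3000.0):
    print(z_pc, round(a_break_au(z_pc)))
```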

It is possible that a_break also depends on the cylindrical
radius, R, with the Sun at R_⊙ = 8 kpc, and perhaps on the
local density of stars, ρ. Because the sample is dominated
by stars at high Galactic latitudes, it is hard to disentangle
the Z dependence from the other two effects (the R
range is small, and ρ varies strongly with Z). We attempt
to do so using the volume-complete 0.7 < Z/kpc < 1.0
sample. First we divide this sample into three subsamples
with median Galactic latitudes, ⟨b⟩, of 35°, 49°, and
80° and determine a_break for each subsample. The best-fit
a_break varies by ~0.3 dex between the low-latitude
and high-latitude subsamples, despite the same median
Z. When the 0.7 < Z/kpc < 1.0 sample is divided into
the Galactic anticenter (90° < l < 270°) and the Galactic
center (l > 270° or l < 90°) subsamples, the best-fit
a_break varies by ~0.1 dex. These variations suggest that
the best-fit Z dependence does not fully capture the behavior
of a_break. Nevertheless, they are smaller (≲ 0.3
dex) than the observed variation of a_break (~1 dex).

The spatial distribution of wide binaries can now be 
compared to the number density of all stars as a func- 
tion of height above the Galactic plane. In Figure [21] 
(bottom left) we show that wide binaries closely follow 
the spatial distribution of stars, with exponential decline 
in the number density as a function of Z. The fraction of 
binaries relative to the number of all stars, shown in the 
bottom right panel, changes by only a factor of 2 over a 
range of 3 kpc, starting from 0.9% at Z = 500 pc and 
declining to 0.5% at Z = 3000 pc. 

4.4. The Limitations of the Kinematic Sample 

It would be informative to repeat the f(a) and a_break
analysis using kinematically-selected binaries, but
unfortunately, the apparent incompleteness of SDSS-POSS
proper motion data at θ < 9" prevents us from doing so.
As shown in Figure 22, the number of common proper
motion pairs drops sharply below θ = 9", probably due
to blending of close sources in the POSS data. Because
of this θ cut-off, for the same range in Z, the effective
a_min for the kinematic sample is three times that of the
geometric sample (where the lower limit on θ is 3"). In
the case of the 0.1 < Z/kpc < 0.4 sample observed here,
the smallest probed semi-major axis (a_min) is at 5079
AU, well above the a_break value of 4534 AU predicted
by the a_break ∝ Z^0.7 relation. Since we are outside the
range where the OD is valid, we cannot measure where the
turnover in f(a) happens, and cannot determine a_break
or N_bin. In all the other Z bins, a_min is also above the
predicted a_break value, and therefore outside the Öpik
regime.

5. DISCUSSION AND CONCLUSIONS 

We have presented a novel approach to photomet- 
ric parallax estimation based on samples of candidate 
wide binaries selected from the Sloan Digital Sky Survey
(SDSS) imaging catalog. Our approach uses the fact
that the components of a binary system are at equal distances
and estimates the photometric parallax relation for
main-sequence stars by minimizing the difference of their
distance moduli. While this method is similar to constraints
on photometric parallax relation obtained from globular 
clusters in that it does not require absolute distance es- 
timates, it has the advantage that it extends to redder 
colors than available for globular clusters observed by 
the SDSS, and it implicitly accounts for the metallicity 
effects. 



The derived best-fit photometric parallax relations rep- 
resent metallicity-averaged relations and thus provide an 
independent confirmation of relations proposed by J08 in 
their study of the Galactic structure. An important result
of this work is our estimate of the expected error distribution
for absolute magnitudes determined from photometric
parallax relations (a root-mean-square scatter
of ~0.3 mag, see Section 3.3), which is in good agreement
with modeling assumptions adopted by J08. The
mildly non-Gaussian error distribution is consistent with 
both the impact of unresolved binary stars, and the vari- 
ation of photometric parallax relation with metallicity; 
we are unable to disentangle these two effects. 

The best-fit photometric parallax relations enabled the
selection of high-efficiency samples of disk wide binaries
with ~22,000 candidates, which include about 14,000 true
binary systems (efficiency of ~2/3). Using the photometric
measurements and angular distance of the two
components, samples with efficiency exceeding 80% can
be constructed (see Section 4.1). Such samples could be
used as a starting point to further increase the selection
efficiency with the aid of radial velocity measurements.
Spectral observations of systems where the brighter component
is an F/G star, for which it is easy to estimate
metallicity, could be used to calibrate both spectroscopic
and photometric methods for estimating metallicity of
cooler K and M dwarfs. Compared to the state-of-the-art
catalogs of wide binaries by Chanamé & Gould (2004)
and Lépine & Bongiorno (2007), the samples discussed
here represent a significant increase in the number of potential
binaries, and probe larger distances (to ~4 kpc).
To facilitate further studies of wide binaries, we make
the catalog publicly available (see footnote 9).

Using the high-efficiency subsamples, we analyzed their
dynamical and physical properties. We find that the
spatial distribution of wide binaries follows the distribution
of single stars to within a factor of 2, and that
the probability for a star to be in a wide binary system
is independent of its color. However, given this
color, the companions of red components seem to be
drawn randomly from the stellar luminosity function,
while blue components have a larger blue-to-red companion
ratio than expected from the luminosity function (see
Section 4.2). These results are consistent with recent results
by Lépine & Bongiorno (2007), and provide strong
constraints for the scenarios describing the formation of
such systems (e.g., Giersz 2006 and references therein;
Clarke 2007; Hurley, Aarseth & Shara 2007).

We also study the semi-major axis distribution of wide
binaries in the 2,000-47,000 AU range (see Section 4.3).
The observed distribution is well described by the Öpik
distribution, f(a) ∝ 1/a, for a < a_break, where a_break
increases roughly linearly with the height above the Galactic
plane (a_break ~ 12,300 AU at Z = 1 kpc). Alternatively,
a_break correlates with the local number density
of stars as a_break ∝ ρ^(-1/4), but we are unable to robustly
identify the dominant correlation (Z and ρ are highly
correlated).

The distribution of semi-major axes for wide binaries
was also discussed by Chanamé & Gould (2004).
They used a sample of wide binaries selected using
common proper motion from the rNLTT catalog
(Gould & Salim 2003) and found f(a) ∝ 1/a^1.6, with
no evidence of a turnover at a < 3000 AU. Their sample
extended to larger angular separations than ours,
and probed smaller distances. On the other hand,
Poveda, Allen & Hernández-Alcántara (2007) used wide
binaries from the same Chanamé & Gould (2004) sample,
and detected the Öpik distribution, f(a) ∝ 1/a, for a <
3,000 AU, consistent with the result of Chanamé & Gould
(2004). In a recent study, Lépine & Bongiorno (2007) searched
for faint common proper motion companions of Hipparcos
stars and detected a turnover from the Öpik distribution
to a steeper distribution around a ~ 3,000 AU. Their
sample also probed much smaller distances than ours.
We compare these results in Figure 23. As is evident, the
variation of a_break with distance from the Galactic plane
detected here (approximately with distance, as shown in
Figure 23, since stars in our sample are mostly at high
galactic latitudes), is consistent with the above results
that are based on more local samples. In particular, this
comparison of different studies suggests that the flattening
of f(a) for small a that "puzzled" Chanamé & Gould
(see their section 4.3) is probably due to a combination
of selection effects and the approach of the domain where
the Öpik distribution is valid in their sample.

9 The catalog can be downloaded from
http://www.astro.washington.edu/bsesar/SDSS_wide_binaries.tar.gz

The Öpik distribution suggests that the process of
star formation produces multiple stars, which evolve towards
binaries after ejecting one or more single stars
(Poveda, Allen & Hernández-Alcántara 2007). The departure
from the Öpik distribution may be evidence
for disruption of wide binaries over long periods of
time by passing stars, giant molecular clouds, massive
compact halo objects (MACHOs), or disk and Galactic
tides (Heggie 1975; Weinberg, Shapiro & Wasserman
1987; Yoo, Chanamé & Gould 2004). However, we
note that the a_break ∝ ρ^(-1/4) correlation (see Figure 21)
is outside the expected range discussed by
Yoo, Chanamé & Gould (2004) (a_break ∝ ρ^(-2/3) for close
strong encounters, and a_break ∝ ρ^(-1) for weak encounters).

The samples presented here can be further refined
and enlarged. First, the SDSS covers only a quarter
of the sky. Upcoming next-generation surveys,
such as the SkyMapper (Keller et al. 2007), the Dark
Energy Survey (Flaugher et al. 2007), Pan-STARRS
(Kaiser et al. 2002) and the Large Synoptic Survey Telescope
(Ivezić et al. 2008b; LSST hereafter), will enable
the construction of such samples over most of the sky.
Due to fainter flux limits (especially for Pan-STARRS
and LSST), the samples will probe a larger distance range
and will reach the halo-dominated parts of the Galaxy.
Furthermore, due to improved photometry and seeing
(e.g., for the LSST, by about a factor of two), the selection
will be more robust. We scale the 20,000 candidates
discussed here, assuming log(N) = C + 0.4r, to the
LSST depth that enables accurate photometric metallicity
(r < 23; I08a) and predict a minimum sample size of
~400,000 candidate wide binary systems in 20,000 deg^2
of sky. It is likely that the sample would include more
than a million systems due to the increase of the stellar
counts close to the Galactic plane.
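
The scaling argument can be checked with a short calculation. The sketch below is an illustration under stated assumptions rather than the authors' calculation: the reference depth r < 20.5 is taken from the magnitude cut in the geometric selection query of Appendix A, while the reference area of 10,000 deg^2 is a round number we assume here because it reproduces the quoted ~400,000 systems. The function name is hypothetical.

```python
def scaled_sample_size(n_ref, r_ref, r_new, area_ref_deg2, area_new_deg2):
    """Scale a sample size with depth and area, assuming log10(N) = C + 0.4 r."""
    depth_gain = 10.0 ** (0.4 * (r_new - r_ref))   # gain from the fainter limit
    area_gain = area_new_deg2 / area_ref_deg2      # gain from the larger area
    return n_ref * depth_gain * area_gain


# 20,000 candidates at r < 20.5 over an assumed 10,000 deg^2, scaled to
# r < 23 over 20,000 deg^2
print(scaled_sample_size(20000, 20.5, 23.0, 10000.0, 20000.0))
```

Under these assumptions the depth contributes a factor of 10 and the area a factor of 2.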

Another important development will come from the
Gaia mission (Perryman et al. 2001; Wilkinson et al.
2005), which will provide direct trigonometric distances
for stars with r < 20. With trigonometric distances,
an accurate photometric parallax relation can be used to
provide strong constraints on the incidence and color
distribution of unresolved multiple systems. Until then,
a radial velocity survey of candidate binaries assembled
here could help with pruning the sample from random
associations, and with better characterization of various
selection effects.

APPENDIX
A. SQL QUERIES

The following SQL queries were used to select initial samples of candidate binaries through the SDSS CasJobs 
interface. When running these queries, the database context must be set to "DR6" or higher. 

select -- geometric selection of candidate binaries
round(p1.ra,6) as ra1, round(p1.dec,6) as dec1, round(p1.extinction_r,3) as rExt1,
round(p1.psfMag_u,3) as psf_u1, round(p1.psfMag_g,3) as psf_g1,
round(p1.psfMag_r,3) as psf_r1, round(p1.psfMag_i,3) as psf_i1,
round(p1.psfMag_z,3) as psf_z1, round(p1.psfMagErr_u,3) as psfErr_u1,
round(p1.psfMagErr_g,3) as psfErr_g1, round(p1.psfMagErr_r,3) as psfErr_r1,
round(p1.psfMagErr_i,3) as psfErr_i1, round(p1.psfMagErr_z,3) as psfErr_z1,
p1.objid as objid1,
round(p2.ra,6) as ra2, round(p2.dec,6) as dec2, round(p2.extinction_r,3) as rExt2,
round(p2.psfMag_u,3) as psf_u2, round(p2.psfMag_g,3) as psf_g2,
round(p2.psfMag_r,3) as psf_r2, round(p2.psfMag_i,3) as psf_i2,
round(p2.psfMag_z,3) as psf_z2, round(p2.psfMagErr_u,3) as psfErr_u2,
round(p2.psfMagErr_g,3) as psfErr_g2, round(p2.psfMagErr_r,3) as psfErr_r2,
round(p2.psfMagErr_i,3) as psfErr_i2, round(p2.psfMagErr_z,3) as psfErr_z2,
p2.objid as objid2,
round(NN.distance*60,3) as theta -- angular separation in arcsec
into mydb.binaryClose
from Neighbors as NN join star as p1 on p1.objid = NN.objid
join star as p2 on p2.objid = NN.neighborobjid
where NN.mode = 1 and NN.neighbormode = 1
and NN.type = 6 and NN.neighbortype = 6
-- magnitude and photometric flag cuts for the first component
and p1.psfMag_r between 14 and 20.5
and (p1.flags_g & '229802225959076') = 0 and (p1.flags_r & '229802225959076') = 0
and (p1.flags_i & '229802225959076') = 0 and (p1.flags_g & '268435456') > 0
and (p1.flags_r & '268435456') > 0 and (p1.flags_i & '268435456') > 0
-- the same cuts for the second component
and p2.psfMag_r between 14 and 20.5
and (p2.flags_g & '229802225959076') = 0 and (p2.flags_r & '229802225959076') = 0
and (p2.flags_i & '229802225959076') = 0 and (p2.flags_g & '268435456') > 0
and (p2.flags_r & '268435456') > 0 and (p2.flags_i & '268435456') > 0
-- component 1 is the brighter and bluer star (extinction-corrected)
and (p1.psfMag_r-p1.extinction_r) < (p2.psfMag_r-p2.extinction_r)
and (p1.psfMag_g-p1.extinction_g - p1.psfMag_i+p1.extinction_i) <
(p2.psfMag_g-p2.extinction_g - p2.psfMag_i+p2.extinction_i)
and NN.distance*60 between 3 and 4

select -- kinematic selection of candidate binaries
round(p1.ra,6) as ra1, round(p1.dec,6) as dec1, round(p1.extinction_r,3) as ext1,
round(p1.psfMag_u,3) as u1, round(p1.psfMag_g,3) as g1,
round(p1.psfMag_r,3) as r1, round(p1.psfMag_i,3) as i1,
round(p1.psfMag_z,3) as z1, round(p1.psfMagErr_u,3) as uErr1,
round(p1.psfMagErr_g,3) as gErr1, round(p1.psfMagErr_r,3) as rErr1,
round(p1.psfMagErr_i,3) as iErr1, round(p1.psfMagErr_z,3) as zErr1,
(case when ((p1.flags & '16') = 0) then 1 else 0 end) as ISOLATED1,
NN.objid as objid1,
round(p2.ra,6) as ra2, round(p2.dec,6) as dec2, round(p2.extinction_r,3) as ext2,
round(p2.psfMag_u,3) as u2, round(p2.psfMag_g,3) as g2,
round(p2.psfMag_r,3) as r2, round(p2.psfMag_i,3) as i2,
round(p2.psfMag_z,3) as z2, round(p2.psfMagErr_u,3) as uErr2,
round(p2.psfMagErr_g,3) as gErr2, round(p2.psfMagErr_r,3) as rErr2,
round(p2.psfMagErr_i,3) as iErr2, round(p2.psfMagErr_z,3) as zErr2,
(case when ((p2.flags & '16') = 0) then 1 else 0 end) as ISOLATED2,
NN.neighborobjid as objid2,
round(NN.distance*60,3) as theta, -- angular separation in arcsec
round(s1.pmL,3) as pmL1, round(s1.pmB,3) as pmB1,
round(s2.pmL,3) as pmL2, round(s2.pmB,3) as pmB2
into mydb.binaryPM
from Neighbors as NN join star as p1 on p1.objid = NN.objid
join star as p2 on p2.objid = NN.neighborobjid
join propermotions as s1 on s1.objid = NN.objid
join propermotions as s2 on s2.objid = NN.neighborobjid
where NN.mode = 1 and NN.neighbormode = 1
and NN.type = 6 and NN.neighbortype = 6
-- magnitude and photometric flag cuts for the first component
and p1.psfMag_r between 14 and 19.5
and (p1.flags_g & '229802225959076') = 0 and (p1.flags_r & '229802225959076') = 0
and (p1.flags_i & '229802225959076') = 0 and (p1.flags_g & '268435456') > 0
and (p1.flags_r & '268435456') > 0 and (p1.flags_i & '268435456') > 0
-- the same cuts for the second component
and p2.psfMag_r between 14 and 19.5
and (p2.flags_g & '229802225959076') = 0 and (p2.flags_r & '229802225959076') = 0
and (p2.flags_i & '229802225959076') = 0 and (p2.flags_g & '268435456') > 0
and (p2.flags_r & '268435456') > 0 and (p2.flags_i & '268435456') > 0
-- component 1 is the brighter and bluer star (extinction-corrected)
and (p1.psfMag_r-p1.extinction_r) < (p2.psfMag_r-p2.extinction_r)
and (p1.psfMag_g-p1.extinction_g - p1.psfMag_i+p1.extinction_i) <
(p2.psfMag_g-p2.extinction_g - p2.psfMag_i+p2.extinction_i)
-- proper motion quality cuts
and s1.match = 1 and s2.match = 1
and s1.sigra < 350 and s1.sigdec < 350
and s2.sigra < 350 and s2.sigdec < 350
-- common proper motion: vector difference below 5, larger total motion between 15 and 400
and sqrt(power(s1.pmL - s2.pmL,2) + power(s1.pmB - s2.pmB,2)) < 5
and (case when sqrt(power(s1.pmL,2) + power(s1.pmB,2)) >
sqrt(power(s2.pmL,2) + power(s2.pmB,2)) then
sqrt(power(s1.pmL,2) + power(s1.pmB,2)) else
sqrt(power(s2.pmL,2) + power(s2.pmB,2)) end) between 15 and 400



Recent analysis of metallicity and kinematics for halo and disk stars by I08a provides sufficient information to 
understand the behavior of the reduced proper motion diagram in quantitative detail (including both the sequence 
separation and their widths), and to demonstrate that its efficiency for separating halo and disk stars deteriorates at 
distances beyond a few kpc from the Galactic plane. As Equation 8 shows, for a population of stars with the same
v_t, the reduced proper motion is a measure of their absolute magnitude. For two stars with the same color that is
sensitive to the effective temperature (such as the g - i color), but with different metallicities and tangential velocities,
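the difference in their reduced proper motions is

\Delta r_{\rm RPM} = r_{\rm RPM}^{H} - r_{\rm RPM}^{D} = 5 \log(v_t^{H}/v_t^{D}) + \Delta M_r ,    (B1)
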
where H and D denote the two stars. In the limit that the shape of the photometric parallax relation does not depend
on metallicity, ΔM_r does not depend on color, and is fully determined by the metallicity difference of the two stars
(or populations of stars). Using metallicity distributions for disk and halo stars obtained by I08a, and their expression
for ΔM_r([Fe/H]) (Equation A2), we find that the expected offset between M_r for halo and disk stars with the same
g - i color varies from 0.6 mag for stars at 1 kpc from the Galactic plane to 0.7 mag at 5 kpc from the plane, where
the variation is due to the vertical metallicity gradient for disk stars. The finite width of halo and disk metallicity
distributions induces a spread of M_r (root-mean-square scatter computed using interquartile range) of 0.15 mag for
disk stars and 0.18 mag for halo stars.

The effect of metallicity on the separation of halo and disk sequences in the reduced proper motion diagram is smaller
than the effect of different tangential velocity distributions. Assuming for simplicity that stars are observed towards
a Galactic pole, and that the median heliocentric tangential velocities are 30 km s^-1 for disk stars and 200 km s^-1
for halo stars, the induced separation of their reduced proper motion sequences is ~4.1 mag (the expected scatter in
the reduced proper motion due to finite velocity dispersion is ~1-1.5 mag). Together with the ~0.7 mag offset due
to different metallicity distributions, the separation of ~5 mag between the two sequences makes the reduced proper
motion diagram a promising tool for separating disk and halo stars.
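
The quoted separation follows from the 5 log(v_t) dependence of the reduced proper motion on tangential velocity. As a minimal check (the function name is ours, and the 0.7 mag metallicity offset is simply added to the kinematic term), the numbers above can be reproduced as follows:

```python
import math


def rpm_sequence_separation(v_halo, v_disk, delta_m_r=0.0):
    """Separation in mag between two reduced proper motion sequences:
    5 * log10(v_halo / v_disk) plus an absolute magnitude offset."""
    return 5.0 * math.log10(v_halo / v_disk) + delta_m_r


kinematic = rpm_sequence_separation(200.0, 30.0)    # velocity term only
total = rpm_sequence_separation(200.0, 30.0, 0.7)   # plus metallicity offset
print(round(kinematic, 2), round(total, 2))
```

The velocity term alone gives about 4.1 mag, and the total is about 4.8 mag, consistent with the ~5 mag separation quoted in the text.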

However, the reduced proper motion diagram is an efficient tool only for stars within 1-2 kpc from the Galactic
plane. The main reason for this limitation is the decrease of rotational velocity for disk stars with distance from the
Galactic plane, with a gradient of about -30 km s^-1 kpc^-1 (see Section 3.4.2 in I08a). As the difference in rotational
velocity between halo and disk stars diminishes with the distance from the plane, the separation of their reduced
proper motion sequences decreases, too. A mild increase in the velocity dispersion of disk stars, as well as a decrease
of their median metallicity with the distance from the plane, also decrease the sequence separation, but the dominant
cause is the rotational velocity gradient.

To illustrate this effect, we select a sample of ~60,000 stars with 14 < r < 20 and 0.2 < g - r < 0.4, that are
observed towards the north Galactic pole (b > 70°). In this color range it is possible to separate disk and halo stars
using the photometric metallicity estimator from I08a, and we further select a sample of ~16,000 likely disk stars with
[Fe/H] > -0.9, and a sample of ~34,400 likely halo stars with [Fe/H] < -1.1 (see Figure 9 in I08a for justification).
Their proper motion distributions as functions of distance from the Galactic plane, Z, are shown in the top left panel
in Figure 24. Because of the gradient in the rotational velocity for disk stars, their median proper motion becomes
constant at ~8 mas yr^-1 beyond Z ~ 2 kpc, while the median proper motion for halo stars is roughly proportional to
1/Z, with a value of ~11 mas yr^-1 at Z = 5 kpc.

The top right panel in Figure 24 shows the positions and widths of the reduced proper motion sequences for disk and
halo stars as functions of Z, and the two bottom panels show the sequence cross-sections for stars with Z = 1 - 1.5
kpc and Z = 3.5 - 4 kpc. At distances beyond ~2 kpc from the plane, the reduced proper motion diagram ceases to
be an efficient tool for separating halo and disk stars because the two sequences start to significantly overlap. This
increasing overlap is a result of the rotational velocity gradient for disk stars, and the finite width of halo and disk
velocity distributions, and would be present even for infinitely accurate measurements (with the proper motion errors
of ~3 mas yr^-1 per coordinate, Munn et al. 2004, the sequence widths of ~1.0-1.5 mag are dominated by velocity
dispersions). Hence, beyond ~2 kpc from the plane, metallicity measurements are necessary to reliably separate disk
and halo populations.

The above analysis is strictly valid only for fields towards the north Galactic pole. Salim & Gould (2003) found that
the position of disk and halo reduced proper motion sequences, relative to their positions at the north Galactic pole,
varies with galactic latitude as

\Delta r_{\rm RPM}(b) = 5 \log(v_t / v_t^{\rm NGP}) = -1.43\,(1 - \sin|b|) ,    (B2)

where v_t^NGP is the median value of v_t for stars observed towards the north Galactic pole. This result is a bit unexpected
because it does not contain longitudinal variation due to projection effects of the rotational motion of the local standard
of rest. We show the variation of Δr_RPM, for stars with 0.2 < g - r < 0.4, as a function of galactic coordinates in
Figure 25. We use photometric metallicity to separate stars into disk and halo populations. As the figure demonstrates, the
longitudinal dependence is present for the halo sample, but not for the disk samples. We have generated simulated behavior
of Δr_RPM using the kinematic model from I08a, and reproduced the observed behavior to within the measurement noise.
It turns out that the vertical gradient of rotational velocity for disk stars is fully responsible for the observed strong
dependence of Δr_RPM on latitude, which masks the dependence on longitude. Hence, the sin(|b|) term proposed
by Salim & Gould (2003) is an indirect discovery of the vertical gradient of rotational velocity for disk stars! These
empirical models also show that a linear dependence of Δr_RPM(b) on sin(|b|) is only approximately correct, and that
it ignores the dependence on distance. While a more involved best-fit expression is possible (full two-dimensional
consideration of proper motion also helps to better separate disk and halo stars), we find that halo stars can always
be efficiently rejected at |b| > 30°, if the separator shown in Figure 5 is shifted upwards by 1 mag.
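
For reference, the latitude dependence in Equation B2 can be evaluated directly. The short sketch below (with a hypothetical function name) gives the shift of the reduced proper motion sequences relative to the north Galactic pole; as noted above, this linear form in sin(|b|) is only approximately correct and ignores the dependence on distance.

```python
import math


def rpm_latitude_shift(b_deg):
    """Shift of the reduced proper motion sequences relative to the north
    Galactic pole, Delta r_RPM(b) = -1.43 * (1 - sin|b|), in mag."""
    return -1.43 * (1.0 - math.sin(math.radians(abs(b_deg))))


for b in (90.0, 60.0, 30.0):
    print(b, round(rpm_latitude_shift(b), 3))
```

At |b| = 30° the shift is about -0.7 mag, which is smaller in size than the 1 mag shift of the separator suggested above.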

C. THE MODELING OF UNRESOLVED BINARIES IN THE SAMPLES OF WIDE BINARIES 

One major uncertainty when using a photometric parallax relation is the lack of information on whether the observed
"star" is a single star, or a binary (multiple) system. If the observed "star" is a binary system, its luminosity will
be underestimated, with the magnitude of the offset depending on the actual composition of the binary. To model
this offset, or to correct for it, one would ideally like to have a probability density map that gives the probability of a
magnitude offset, ΔM_r, as a function of the observed binary system's color.

To construct such a map, we have generated a sample of 100,000 unresolved binary systems by randomly pairing
stars drawn from the Kroupa, Tout & Gilmore (1990) luminosity function. By independently drawing the luminosities
of each component to generate unresolved binary systems, we implicitly assume that the formation of each component
is unaffected by the presence of the other. While there are other proposed mechanisms for binary formation (Clarke
2007 and references therein), we have chosen this one because it was easy to implement.

For every unresolved binary system we calculate the total r band luminosity, and the $r-i$ and $g-i$ colors of the
system. The magnitude offset, $\Delta M_r$, caused by unresolved binarity, is obtained as the difference between the
true r band absolute magnitude and the absolute magnitude for the pair's joint $r-i$ color calculated using Equation 15.
The probability density map is then simply the number of unresolved binary systems (normalized by the total number
of systems at a given color) as a function of $\Delta M_r$ and the pair's joint $g-i$ color, shown in Figure 26.
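
The bookkeeping for a single simulated system can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function names, the representation of a star as a dictionary of g, r, and i absolute magnitudes, and the linear color-magnitude relation `toy_relation` (a placeholder standing in for Equation 15, whose coefficients are not reproduced here) are all assumptions. The only physics used is that fluxes, not magnitudes, add.

```python
import math

def combine_mags(m1, m2):
    """Magnitude of two blended sources: fluxes add, magnitudes do not."""
    return -2.5 * math.log10(10 ** (-0.4 * m1) + 10 ** (-0.4 * m2))

def toy_relation(g_i):
    """HYPOTHETICAL placeholder for Equation 15 (absolute magnitude from color)."""
    return 4.0 + 3.5 * g_i

def magnitude_offset(star_a, star_b, relation=toy_relation):
    """Return (joint g-i color, Delta M_r) for an unresolved pair.

    Each star is a dict of absolute magnitudes in the g, r, and i bands.
    Delta M_r = M_r(assumed from the joint color) - M_r(true, blended).
    """
    blended = {band: combine_mags(star_a[band], star_b[band]) for band in "gri"}
    joint_color = blended["g"] - blended["i"]
    return joint_color, relation(joint_color) - blended["r"]

# Two identical stars placed on the toy relation at g-i = 2.5
# (the individual g and i values are illustrative; only g-i matters here).
m_r = toy_relation(2.5)
star = {"g": m_r + 1.5, "r": m_r, "i": m_r - 1.0}
color, delta_m = magnitude_offset(star, star)
```

For two identical components the joint color is unchanged and the offset equals 2.5 log10(2), about 0.75 mag, independent of the relation adopted.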

It is worth noting that, with the adopted binary formation mechanism, the magnitude offset is the smallest
($\Delta M_r < 0.1$ mag) for the bluest stars, and greatest ($\Delta M_r > 0.7$ mag) for the reddest stars. Because of
this, the scatter due to unresolved binarity in the $\delta$ distribution should be more pronounced in a sample of red
stars ($g-i > 2.0$) than in a sample of blue stars.

The map shown in Figure 26 can be parametrized as a Gaussian distribution $P(\Delta M_r | \mu, \sigma)$, where

$$\mu = 0.037 + 0.10\,(g-i) + 0.09\,(g-i)^2 - 0.012\,(g-i)^3 \qquad {\rm (C1)}$$

is the median $\Delta M_r$, and

$$\sigma = 0.041 + 0.03\,(g-i) + 0.15\,(g-i)^2 - 0.057\,(g-i)^3 \qquad {\rm (C2)}$$

is the scatter (determined from the interquartile range). To verify the validity of this parametrization, we subtract
$\mu$ from $\Delta M_r$, normalize the difference by $\sigma$, find the distribution of such values, and fit a Gaussian
to it. As shown in Figure 27, the distribution of normalized residuals is well described by a Gaussian with a width of
0.9. The two peaks in the distribution are due to highly asymmetric distributions of $\Delta M_r$ values around the
median $\Delta M_r$ for the bluest ($g-i \sim 0.1$) and reddest ($g-i \sim 2.9$) systems.
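
The parametrization and the residual check can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the factor 0.7413 used to convert an interquartile range into a Gaussian-equivalent width is a standard identity adopted here as an assumption, since the text states only that the scatter was determined from the interquartile range.

```python
import numpy as np

def mu_offset(g_i):
    """Median magnitude offset as a function of g-i color, Equation C1."""
    return 0.037 + 0.10 * g_i + 0.09 * g_i**2 - 0.012 * g_i**3

def sigma_offset(g_i):
    """Scatter of the magnitude offset as a function of g-i color, Equation C2."""
    return 0.041 + 0.03 * g_i + 0.15 * g_i**2 - 0.057 * g_i**3

def robust_sigma(values):
    """Gaussian-equivalent width from the interquartile range (assumed estimator)."""
    q25, q75 = np.percentile(values, [25, 75])
    return 0.7413 * (q75 - q25)

def normalized_residuals(delta_m, g_i):
    """(Delta M_r - mu) / sigma, the quantity histogrammed in the validity check."""
    return (delta_m - mu_offset(g_i)) / sigma_offset(g_i)

# Self-consistency check on synthetic offsets drawn from the parametrization itself.
rng = np.random.default_rng(0)
colors = rng.uniform(0.5, 2.5, size=50_000)
draws = rng.normal(mu_offset(colors), sigma_offset(colors))
width = robust_sigma(normalized_residuals(draws, colors))
```

Because the synthetic offsets are Gaussian by construction, `width` comes out close to 1; the value of 0.9 quoted above refers to the simulated binary systems, not to this toy check.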

To create a sample of wide binaries where some of the stars are unresolved binary systems, we first select pairs with
20" < $\theta$ < 30" from the initial sample of stellar pairs. Following the procedure described in Section 3.2, we
create the "true" wide binaries by changing the $r_2$ magnitude using Equation 11, and add 0.15 mag of Gaussian noise
to simulate the scatter in the photometric parallax due to photometric errors. A fraction of stars is then randomly
converted to unresolved binary systems by subtracting a $\Delta M_r$ value from the r band (apparent) magnitude, where
$\Delta M_r$ is drawn from a $g-i$ color-dependent $P(\Delta M_r | \mu, \sigma)$ distribution.
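
The conversion step described above can be sketched as follows. The function name `add_unresolved_binaries` and its arguments are hypothetical, and the sketch assumes that the color of a converted star is left unchanged; the polynomial coefficients are those of Equations C1 and C2, listed from the highest power down as `np.polyval` expects.

```python
import numpy as np

MU_COEFFS = [-0.012, 0.09, 0.10, 0.037]     # Equation C1, highest power first
SIGMA_COEFFS = [-0.057, 0.15, 0.03, 0.041]  # Equation C2, highest power first

def add_unresolved_binaries(r_mag, g_i, binary_fraction, rng):
    """Brighten a random subset of stars by a color-dependent magnitude offset.

    Returns the modified r magnitudes and a boolean mask of converted stars.
    Only valid for colors where the C2 polynomial is positive.
    """
    r_mag = np.asarray(r_mag, dtype=float)
    g_i = np.asarray(g_i, dtype=float)
    is_binary = rng.random(r_mag.size) < binary_fraction
    offsets = rng.normal(np.polyval(MU_COEFFS, g_i), np.polyval(SIGMA_COEFFS, g_i))
    # Subtracting the offset makes the converted star appear brighter.
    return np.where(is_binary, r_mag - offsets, r_mag), is_binary

rng = np.random.default_rng(42)
r = np.full(10_000, 18.0)
colors = np.full(10_000, 2.2)
r_new, mask = add_unresolved_binaries(r, colors, binary_fraction=0.4, rng=rng)
```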

Figure 28 shows the $\delta$ distribution for such a mock sample, where the components are redder than $g-i = 2.0$
and have a 40% probability of being unresolved binary systems. The different configurations of single stars and
unresolved binaries that contribute to the observed $\delta$ distribution can be easily identified. Wide binaries where
both components are single stars contribute the central narrow Gaussian, with its width due to photometric errors. If
the brighter component is an unresolved binary system, its luminosity is underestimated, and the result is an offset in
$\delta$ in the negative direction. A similar outcome happens if the fainter component is an unresolved binary system,
but the offset is positive. Single star-unresolved binary configurations, therefore, contribute the left and the right
Gaussians. If both components are unresolved binary systems, the $\delta$ distribution will be centered on zero and
will be $\sigma_0\sqrt{2}$ wide, where $\sigma_0$ is the width of the $(\Delta M_r - \mu)$ distribution. This behavior
is consistent with the $\delta$ distributions observed in Figure 12.
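
The nominal weights of the four configurations, and the $\sigma_0\sqrt{2}$ width, follow from treating the two components as independent. As a short worked sketch, assume that each component is an unresolved binary with the same probability $p$ ($p = 0.4$ in the mock sample) and that the offsets of the brighter and fainter components, $x_1$ and $x_2$ (symbols introduced here for illustration), are independent with a common width $\sigma_0$:

```latex
\begin{align*}
P(\text{single, single}) &= (1-p)^2 = 0.36, \\
P(\text{binary, single}) &= P(\text{single, binary}) = p\,(1-p) = 0.24, \\
P(\text{binary, binary}) &= p^2 = 0.16, \\
\mathrm{Var}(x_2 - x_1) &= \mathrm{Var}(x_1) + \mathrm{Var}(x_2) = 2\sigma_0^2
  \;\Rightarrow\; \text{width} = \sigma_0\sqrt{2}.
\end{align*}
```

These are nominal mixing fractions only. Fitted Gaussian areas can differ from them, because the offset distributions are not exactly Gaussian and the components overlap.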

TABLE 1

The centers, widths, and areas for best-fit Gaussian distributions

           Geometrically-selected sample        Kinematically-selected sample
           Narrow Gaussian   Wide Gaussian      Narrow Gaussian   Wide Gaussian
Center         -0.01            -0.03               -0.05             0.01
Width           0.12             0.54                0.11             0.51
Area^a          0.26             0.74                0.34             0.66

^a Areas of the narrow and wide Gaussians sum to 1.

TABLE 2

The conditional probability density functions
$P[(g-i)_B | (g-i)_A] = a + b\,(g-i) + c\,(g-i)^2$

                               Best-fit parameters
(g-i)_A bin                    a        b        c
0.4 < (g-i)_A < 0.8           0.38     ...      ...
0.8 < (g-i)_A < 1.2           0.46     ...      ...
1.2 < (g-i)_A < 1.6           0.37     ...      ...
1.6 < (g-i)_A < 2.0           0.37     ...      ...
2.0 < (g-i)_A < 2.4           0.08     0.14     0.04
2.4 < (g-i)_A < 2.8           0.23    -0.50     0.38

Fig. 1. - Top: A comparison of the observed ($f_{\rm obs}$, solid histogram) and random ($f_{\rm rnd}$, dashed line,
see text) distributions of angular separation $\theta$. Middle: Ratio $f_{\rm obs}/f_{\rm rnd}$ as a function of
angular separation $\theta$. Bottom: Fraction of true binary systems, $\epsilon$, as a function of angular separation
$\theta$.
Fig. 2. - Distribution of counts for the geometrically-selected candidate sample (top), the random sample (middle),
and the ratio of the two maps (bottom) in the $\Delta r = r_2 - r_1$ vs. $\Delta(g-i) = (g-i)_2 - (g-i)_1$ diagram,
binned in 0.05 × 0.1 mag bins. The average candidate-to-random ratio in the region outlined by the dashed lines
(Eqs. 3 and 4) is ~1.7, implying that > 40% of candidates are true binaries.
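
The step from the candidate-to-random ratio to the true-binary fraction quoted in the Fig. 2 caption can be written out explicitly. This is a short sketch, assuming that the random sample is normalized to the expected number of chance alignments, so that the candidates are the sum of true binaries and random pairs (the symbols $N_{\rm cand}$, $N_{\rm rnd}$, $N_{\rm true}$, and $R$ are introduced here for illustration only):

```latex
\begin{align*}
R &\equiv \frac{N_{\rm cand}}{N_{\rm rnd}} \approx 1.7, \\
\frac{N_{\rm true}}{N_{\rm cand}} &= \frac{N_{\rm cand} - N_{\rm rnd}}{N_{\rm cand}}
   = 1 - \frac{1}{R} \approx 1 - \frac{1}{1.7} \approx 0.41 .
\end{align*}
```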
Fig. 3. - The r vs. $g-i$ distribution of the brighter (top) and fainter (bottom) components from the
geometrically-selected sample of candidate binaries, shown with linearly spaced contours.
[Fig. 4 color bar label: log10(stars/kpc$^3$/mag), NGP ($b > 70°$)]
Fig. 4. - The color-coded map, with the legend shown in the top right corner, shows the logarithm of the volume number
density (stars/kpc$^3$/mag) of ~2.8 million stars with 14 < r < 21.5 observed towards the north Galactic pole
($b > 70°$), as a function of their distance modulus and $g-i$ color (the density variation in the horizontal
direction represents the luminosity function, and the variation in the vertical direction reflects the spatial volume
density profiles of disk and halo stars). The absolute magnitudes are computed using expressions A3 and A7 from I08a,
and the displayed distance range is 100 pc to 25 kpc. Stars are color-selected from the main stellar locus (dominated
by main-sequence stars) using criteria 3-5 from Section 2.3.1 in I08a. The metallicity correction is applied using
photometric metallicity for stars with $g-i < 0.7$ (based on Eq. 4 from I08a), and by assuming [Fe/H] = $-0.6$ for
redder stars. As illustrated above the $g-i$ axis using the MK spectral type vs. $g-i$ color table from Covey et al.
(2007), this color roughly corresponds to G5. The two vertical arrows mark the turn-off color for disk stars, and the
red edge of the M dwarf color distribution (there are redder M dwarfs detected by SDSS, but their volume number
density, i.e., the luminosity function, falls precipitously beyond this limit; J. Bochanski, priv. comm.). The two
diagonal dashed lines show the apparent magnitude limits, r = 14 and r = 21.5. The dot-dashed diagonal line corresponds
to r = 20, which approximately describes the 50% completeness limit for stars with cataloged proper motions (Munn et
al. 2004). Around the marked distance range of 3-4 kpc, the counts of halo stars begin to dominate over disk stars
(see Fig. 6 in I08a), and the distance range around 1 kpc offers the largest color completeness.
Fig. 5. - The reduced proper motion diagrams for two subsamples of stars shown in Fig. 4. The color-coded maps show
the logarithm of the number of stars per pixel, according to the legends. The left panel corresponds to a sample of
~446,000 stars with proper motions in the range 15-50 mas/yr, and the right panel to a sample of 43,000 stars from the
range 50-400 mas/yr. The requirement of larger proper motions introduces a bias towards closer, and thus redder, stars.
The two long-dashed lines in each panel correspond to the photometric parallax relation from I08a, evaluated for
[Fe/H] = $-0.6$ and with a tangential velocity of 55 km/s (top curve) and 120 km/s (bottom curve). This variation of
tangential velocity is consistent with the rotational velocity gradient discussed by I08a. The dot-dashed line is
evaluated for [Fe/H] = $-1.5$ and with a tangential velocity of 300 km/s. The short-dashed line (second from the
bottom) separates disk and halo stars, and is evaluated for [Fe/H] = $-1.5$ and with a tangential velocity of 180 km/s.
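
The vertical spacing between the model curves in Fig. 5 follows from how reduced proper motion scales with tangential velocity. As a sketch, assume the usual definition in which the reduced proper motion equals the absolute magnitude plus $5\log v_t$ plus a constant (the zero point depends on the adopted units and is not needed here). Two curves evaluated for the same metallicity but different tangential velocities, $v_{t,1}$ and $v_{t,2}$, are then offset at fixed color by:

```latex
\begin{align*}
\Delta({\rm RPM}) &= 5 \log_{10}\!\left(\frac{v_{t,2}}{v_{t,1}}\right), \\
\text{[Fe/H]} = -0.6\ \text{curves:}\quad 5 \log_{10}\!\left(\frac{120}{55}\right) &\approx 1.69\ {\rm mag}, \\
\text{[Fe/H]} = -1.5\ \text{curves:}\quad 5 \log_{10}\!\left(\frac{300}{180}\right) &\approx 1.11\ {\rm mag}.
\end{align*}
```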
[Fig. 6 panel titles: fraction of stars with $|\mu|$ = 15-400 mas/yr; disk stars (rpm, $|\mu|$ = 15-400 mas/yr)]
Fig. 6. - Analogous to Fig. 4, for subsamples selected using proper motion measurements. Out of 2.8 million stars
shown in Fig. 4, 1.24 million are brighter than r = 19.5 and have proper motion measurements. Of those, 498,000 have
proper motions in the range 15-400 mas/yr (only 10% of selected stars have proper motions greater than 50 mas/yr). The
color-coded map in the top left panel shows the fraction of such stars, as a function of distance and $g-i$ color. At
a distance of ~1 kpc, about half of all stars have proper motions larger than 15 mas/yr. The top right panel shows the
counts of candidate disk stars, selected as stars above the separator shown in Fig. 5, and the bottom left panel shows
halo stars selected from below the separator. The bottom right panel shows the counts of halo stars, as a fraction of
all stars selected using the reduced proper motion diagram. Note that beyond 3 kpc, the sample is dominated by halo
stars.
Fig. 7. - Top: The photometric metallicity vs. distance from the Galactic plane diagram for candidate binaries selected
from the geometric sample using $|\delta| < 0.4$ and $0.2 < (g-r)_1 < 0.4$. The $|\delta| < 0.4$ cut is used to reduce
the contamination by random pairs (see Section 3.6). Note that the fraction of low-metallicity halo binaries
([Fe/H] < $-1$) becomes significant only at Z > 2 kpc. Middle: Analogous to the top panel, except that binaries from
the kinematic sample are shown. Dots correspond to binaries with reduced proper motions characteristic of disk
binaries, and triangles to candidate halo binaries. Note that binaries with disk-like metallicity ([Fe/H] > $-1$) at
large distances (Z > 2 kpc) are misclassified as halo binaries. Bottom: The comparison of the $(u-g)_1$ color
distributions, and corresponding photometric metallicity distributions, for binaries from the top two panels. The
metallicity vs. $u-g$ color transformation is taken from I08a. The distribution for binaries from the geometric sample
is shown by the thick solid line, and the distributions for binaries from the kinematic sample are shown by the thin
solid line (disk candidates) and dotted line (halo candidates).
Fig. 8. - Distribution of $\delta = (M_{r2} - M_{r1}) - (r_2 - r_1)$ values for a mock sample of candidate binaries
(solid line) when $M_r = M_r(r-i|p_0)$ (top), and for an $M_r$ different from $M_r(r-i|p_0)$ (bottom). The fraction of
random pairs (the contamination) in the sample is 80%. The $\delta$ distribution for "true" binaries (dots) is obtained
by subtracting the $\delta$ distribution of random pairs (open circles) from the candidate binary $\delta$
distribution. The best-fit Gaussian for the "true" binaries $\delta$ distribution is centered on 0 and 0.1 mag wide in
the top panel, and centered on $-0.02$ and 0.13 mag wide in the bottom panel.
Fig. 9. - The dependence of the median $\delta$ values, $\langle\delta\rangle$, on the $r-i$ colors of the brighter and
fainter components for the geometrically- (top) and kinematically-selected (bottom) samples of candidate binaries with
$|\delta| < 0.4$. The $r-i$ color axes are interpolated from the $g-i$ axes using Eq. 10. Sources are binned in
0.1 × 0.1 mag $g-i$ color pixels (minimum of 6 sources per pixel), and the median values are color-coded according to
the legends given at the top of each panel. Inset histograms show the distribution of $\langle\delta\rangle$. The
medians of the $\langle\delta\rangle$ distributions are 0 to within < 0.01 mag, and the scatter (determined from the
interquartile range) is 0.07 mag for both samples.
Fig. 10. - Comparison of the Eq. 14 (dot-dashed line) and Eq. 15 (dashed line) photometric parallax relations from J08
(their Eqs. 1 and 2) with the Eq. 12 (solid line) and Eq. 13 (dotted line) photometric parallax relations determined
in this work. The inset shows the magnitude difference, $\Delta$ (the J08 relation minus the relation from this work),
between the Eq. 15 photometric parallax relation and Eqs. 12 (solid line) and 13 (dotted line) from this work. The rms
scatter between Eqs. 12 and 13 and Eq. 15 is 0.13 mag. The rms scatter between Eqs. 12 and 13 (dashed line) is also
0.13 mag.
Fig. 11. - A comparison of the $(g-i)_2$ vs. $(g-i)_1$ color-color distributions of geometrically-selected (top) and
kinematically-selected disk binaries (bottom) with $|\delta| < 0.4$. The fraction of binaries in a pixel is
color-coded according to the legends. The pixels are 0.2 × 0.2 mag wide in $g-i$ color, and the $r-i$ color axes are
interpolated from the $g-i$ axes using Eq. 10.
Fig. 12. - Distribution of $\delta$ values for the geometrically- (top) and kinematically-selected (bottom) samples of
candidate binaries, with absolute magnitudes, $M_r$, calculated using Eqs. 12 and 13, respectively. The $\delta$
distribution for true binaries (open circles) is obtained by subtracting the $\delta$ distribution of random pairs
(triangles) from the $\delta$ distribution for candidate binaries (thick solid line). The $\delta$ distribution for
true binaries is a non-Gaussian distribution (dashed line) that can be described as a sum of two Gaussian
distributions. The centers, widths, and areas for the best-fit narrow (dotted line) and wide (thin solid line) Gaussian
distributions are given in Table 1. The integrals (areas) of the $\delta$ distributions for random pairs and candidate
binaries are $A_{\rm random}$ and $A_{\rm observed}$, respectively.
Fig. 13. - Distribution of $\delta$ values normalized by the expected formal errors, $\sigma_\delta$, for the
kinematically-selected sample of candidate binaries. The $\delta/\sigma_\delta$ distribution for true binaries (open
circles) is obtained by subtracting the $\delta/\sigma_\delta$ distribution of random pairs (triangles) from the
$\delta/\sigma_\delta$ distribution for candidate binaries (thick solid line). The $\delta/\sigma_\delta$ distribution
for true binaries is a non-Gaussian distribution (dashed line) that can be described as a sum of two Gaussian
distributions. The best-fit narrow Gaussian (dotted line) is 0.75 wide and centered on $-0.10$, while the best-fit wide
Gaussian (thin solid line) is 4.04 wide and centered on $-0.14$.
Fig. 14. - Dependence of median PSF magnitude errors on magnitude for the brighter (left) and fainter (right)
components in the geometrically- (dots) and kinematically-selected (triangles) samples of candidate binaries. The
vertical bars show the rms scatter in each bin (not the error of the median, which is much smaller). The fainter
components of geometrically-selected candidate binaries have overestimated median PSF magnitude errors when compared to
the kinematically-selected binaries.
Fig. 15. - The fraction of $|\delta| < 0.4$ binaries in the 0.7 < d/kpc < 1.0 volume-complete geometrically-selected
(top) and random (middle) samples that have $(g-i)_1$ and $(g-i)_2$ as the colors of the brighter and fainter
component. The pixels are 0.2 × 0.2 mag wide in $g-i$ color, and the $r-i$ color axes are interpolated from the $g-i$
axes using Eq. 10. The pixels in the maps sum to 1. The bottom plot shows the difference,
$f_{\rm cand}[(g-i)_1, (g-i)_2] - C\,f_{\rm rand}[(g-i)_1, (g-i)_2]$, between the two maps, where C = 0.14 is the
fraction of random pairs estimated using Eq. 21 for the $|\delta| < 0.4$, 0.7 < d/kpc < 1.0 geometrically-selected
sample. The pixels with negative values are not shown and the map is renormalized so that the pixels sum to 1.
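
The map subtraction described in the Fig. 15 caption can be sketched as follows. The function name and the small 2 × 2 example maps are hypothetical; masking negative pixels with NaN is one possible way to represent pixels that are "not shown".

```python
import numpy as np

def subtract_random_pairs(f_cand, f_rand, contamination):
    """Remove the expected random-pair contribution from a color-color map.

    f_cand, f_rand : 2-D arrays, each normalized so that its pixels sum to 1.
    contamination  : fraction C of random pairs in the candidate sample.
    Pixels that come out negative are masked with NaN, and the remaining
    pixels are renormalized to sum to 1.
    """
    diff = np.asarray(f_cand, dtype=float) - contamination * np.asarray(f_rand, dtype=float)
    diff = np.where(diff < 0, np.nan, diff)
    return diff / np.nansum(diff)

# Toy maps (illustrative values only), with C = 0.14 as quoted in the caption.
f_cand = np.array([[0.50, 0.30], [0.15, 0.05]])
f_rand = np.array([[0.10, 0.10], [0.30, 0.50]])
corrected = subtract_random_pairs(f_cand, f_rand, contamination=0.14)
```

In this toy case the bottom right pixel becomes negative (0.05 - 0.14 × 0.50) and is masked, and the three remaining pixels are rescaled to sum to 1.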
Fig. 16. - Conditional probability density of having one component with $(g-i)_B$ color in a wide binary system where
the other component has $(g-i)_A$. The conditional probability density for $(g-i)_A < 2.0$ (top and middle) is
independent of $(g-i)_B$, while for $(g-i)_A > 2.0$ (bottom) it changes as the square of $(g-i)_B$. The best-fit
functions describing these conditional probabilities are given in Table 2.
Fig. 17. - Top: A comparison of the $g-i$ color distribution of stars in the $|\delta| < 0.4$, 0.7 < d/kpc < 1.0
volume-complete, geometrically-selected, wide binary sample (solid line), and of all stars in the same volume (dashed
line). The distributions are normalized to an area of 1, and the error bars show the Poisson noise. Bottom: The
probability density for finding a star with $g-i$ color in a wide binary system, $P_{\rm wide\,binary}$, calculated as
the ratio of the two distributions from the top panel, and renormalized to an area of 1. The equal probability
distribution is shown as the dashed line.
Fig. 18. - Top left: The cumulative distribution of log(a) for geometrically-selected candidate binaries with
$|\delta| < 0.2$ and 0.7 < Z/kpc < 1.0, where a is the average semi-major axis. Top right: The differential
distribution of angular separation, $\theta$, for geometrically-selected candidate binaries with $|\delta| < 0.2$ and
0.7 < Z/kpc < 1.0. The distribution of random pairs (dashed line) is obtained by fitting a linear function
$f_{\rm rnd}(\theta) = C\,\theta$ to the observed histogram for $\theta$ > 18". $\theta_{\rm max}$ is defined as the
angular separation for which the fraction of true binaries falls below ~5%. Middle left: The fraction of true binaries,
$\epsilon$ (solid line), calculated from the $\theta$ distribution using Eq. 2 (see Section 2.3) for the
0.7 < Z/kpc < 1.0 sample, is modeled as a second-degree polynomial, $\epsilon(\theta)$ (dashed line). For three
$\theta$-selected subsamples (4"-5", 5"-6", and 7"-8"), the fraction of true binaries was also calculated using Eq. 21
(i.e., from the $\delta$ distribution) and is shown with symbols. Middle right: The box (dashed lines) shows the
allowed range in a defined by $Z_{\rm min}$, $Z_{\rm max}$, and $\theta_{\rm max}$ (see Eqs. 23 and 24). Only binaries
within this a range are considered when plotting the corrected cumulative distribution of log(a). Bottom left: The
cumulative distribution of log(a) for candidate binaries with $|\delta| < 0.2$ and 0.7 < Z/kpc < 1.0 (dashed line),
corrected using $\epsilon(\theta)$ to account for the decreasing fraction of true binaries at large
$\theta \propto a/d$ separations. The vertical lines show log(a) for which the straight line fit (dot-dashed line) to
the cumulative distribution deviates by more than 1.0% ($\log(a_{\rm low})$), 1.5% ($\log(a_{\rm break})$), and 2.0%
($\log(a_{\rm high})$). Bottom right: The corrected cumulative distribution of log(a) for mock candidate binaries
created using the $f(a) \propto a^{-0.8}$ distribution limited to $a_1 = 100$ AU and $a_2 = 10000$ AU.
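
The random-pair fit described in the Fig. 18 caption can be sketched as below. A linear form is expected for chance alignments because the area of an annulus of fixed width grows in proportion to its radius. The function names and the toy histogram are hypothetical, and the expression used for the true-binary fraction, $\epsilon = 1 - f_{\rm rnd}/f_{\rm obs}$, is an assumed form, since Eq. 2 is not reproduced in this part of the text.

```python
import numpy as np

def fit_random_slope(theta, counts, theta_min=18.0):
    """Least-squares slope C of f_rnd(theta) = C * theta, using bins beyond theta_min."""
    theta = np.asarray(theta, dtype=float)
    counts = np.asarray(counts, dtype=float)
    far = theta > theta_min
    # Closed-form least-squares solution for a line through the origin.
    return np.sum(theta[far] * counts[far]) / np.sum(theta[far] ** 2)

def true_binary_fraction(theta, counts, slope):
    """ASSUMED form of the efficiency: epsilon = 1 - f_rnd / f_obs."""
    theta = np.asarray(theta, dtype=float)
    counts = np.asarray(counts, dtype=float)
    return 1.0 - slope * theta / counts

# Toy histogram: a linear random-pair term plus an excess of close pairs.
theta = np.arange(3.5, 30.0, 1.0)                     # bin centers in arcsec
excess = np.where(theta < 15.0, 400.0 / theta, 0.0)   # illustrative "true binaries"
counts = 10.0 * theta + excess
slope = fit_random_slope(theta, counts)
eps = true_binary_fraction(theta, counts, slope)
```

In this toy setup the fit recovers the input slope of 10, and the estimated fraction of true binaries falls from about 0.77 in the first bin to zero beyond 15".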
Fig. 19. - Similar to the Fig. 18 (bottom) plot, but for different Z (height above the Galactic plane) bins ranging
from 0.1 < Z/kpc < 0.4 (top left) to 2.6 < Z/kpc < 3.6 (bottom right). The sampled range of average semi-major axes and
angular separations is given for each panel. In the 0.1 < Z/kpc < 0.4 bin (top left), the upper limit on
$\log(a_{\rm break})$ is 3.50.
Fig. 20. - The fraction of true binaries ($\epsilon$) in the 0.1 < Z/kpc < 0.4, $|\delta| < 0.2$ geometrically-selected
sample as a function of angular separation. The fraction goes below ~5% at $\theta_{\rm max}$ = 16", which puts the
upper limit on probed semi-major axes at ~3,200 AU.
Fig. 21. - Top left: The dependence of $\log(a_{\rm break})$ values (cf. Fig. 19) on log(Z) (dots) is modeled as
$\log(a_{\rm break}) = k \log(Z) + l$, where $k = 0.72 \pm 0.05$ and $l = 1.93 \pm 0.15$, or approximately,
$a_{\rm break}$ [AU] = 12,302 Z[kpc]$^{0.72}$. The symbol size shows the range between $\log(a_{\rm low})$ and
$\log(a_{\rm high})$. The arrow indicates that the $\log(a_{\rm break})$ in the 0.1 < Z/kpc < 0.4 bin (log(Z) ~ 2.4) is
an upper limit. Top right: The dependence of $\log(a_{\rm break})$ on $\log(\rho)$, where $\rho$ is the local number
density of stars, is modeled as $\log(a_{\rm break}) = k \log(\rho) + l$, where $k = -0.24 \pm 0.02$ and
$l = 3.35 \pm 0.07$, or $a_{\rm break} \propto \rho^{-1/4}$. Bottom left: The dependence of the local number density,
$\ln(\rho)$, of binaries (dots) and stars (circles) on the height above the Galactic plane, where the density of stars
is normalized to match the density of binaries at 1 kpc. Bottom right: The fraction of binaries relative to the total
number of stars as a function of the height above the Galactic plane. The arrow shows the predicted fraction of
binaries in the 0.1 < Z/kpc < 0.4 bin, if the $a_{\rm break}$ value follows the $a_{\rm break} \propto Z^{0.72}$
relation.
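
The two forms of the fit quoted for the top left panel of Fig. 21 are related by a change of units. As a short worked sketch, assume that Z in the logarithmic fit is expressed in pc, which is consistent with log(Z) ~ 2.4 for the 0.1 < Z/kpc < 0.4 bin:

```latex
\begin{align*}
\log a_{\rm break} &= 0.72\,\log(Z/{\rm pc}) + 1.93 \\
 &= 0.72\,\left[\log(Z/{\rm kpc}) + 3\right] + 1.93 \\
 &= 0.72\,\log(Z/{\rm kpc}) + 4.09, \\
a_{\rm break} &\approx 10^{4.09}\,(Z/{\rm kpc})^{0.72}\ {\rm AU}
   \approx 12{,}300\,(Z/{\rm kpc})^{0.72}\ {\rm AU}.
\end{align*}
```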
Fig. 22. - The distribution of angular separation for the 0.1 < Z/kpc < 0.4, $|\delta| < 0.2$ kinematically-selected
sample of candidate binaries. The data (solid line) extend to $\theta$ = 500", though the plotted range is restricted
for clarity. The distribution of random pairs (dashed line) was obtained by fitting $f_{\rm rnd}(\theta) = C\,\theta$
to the observed histogram for $\theta$ > 200". The sharp drop-off in the observed distribution for $\theta$ < 9" is
probably due to blending of close pairs in the POSS data.
Fig. 23. - A comparison of results for the turnover in the distribution of semi-major axes, $a_{\rm break}$, as a
function of distance modulus, of wide binary systems determined here (symbols with error bars; the horizontal bars mark
the range of probed semi-major axes, and the vertical bars mark the width of the distance bins; the lowest point is
only a lower limit; for the sake of comparison with other results we ignore the difference between distance from us and
distance from the Galactic plane because our sample is dominated by high Galactic latitude stars), determined by Lépine
& Bongiorno (2007; the dashed rectangle indicates the constraint on $a_{\rm break}$ and the probed distance range), and
determined by Chanamé & Gould (2004; big arrows, indicating upper limits on $a_{\rm break}$ and the probed distance
range; the point at larger distance modulus corresponds to halo binaries). The diagonal dashed lines are lines of
constant angular scale, $\theta$, for values of 3", 4", 5", 10", 20" and 30" (from left to right).
Fig. 24. - The top left panel shows the proper motion distribution as a function of distance from the Galactic plane
(Z) for a sample of ~16,000 likely disk stars (red dots) and a sample of ~34,400 likely halo stars (blue dots). All
stars have 14 < r < 20 and $0.2 < g-r < 0.4$, and are separated using photometric metallicity. The triangles show the
median values in 500 pc wide Z bins for each sample (lower symbols: disk, upper symbols: halo). Note that the median
proper motion for disk stars becomes constant beyond Z ~ 2 kpc due to the vertical gradient of rotational velocity for
disk stars. The top right panel shows the median position (symbols) and widths (lines; $\pm 1\sigma$ envelope around
the medians) of the reduced proper motion sequences for disk (red dots and dashed lines) and halo (blue squares and
dot-dashed lines) stars, as functions of Z. The two bottom panels show the cross-sections of the reduced proper motion
sequences for stars with Z = 1-1.5 kpc (bottom left; red histogram for disk stars and blue for halo stars) and
Z = 3.5-4 kpc. The histograms are normalized by the total number of stars in each subsample. The disk-to-halo star
count ratio is 4.3 in the left panel, and 0.38 in the right panel. Note the significant overlap of the two sequences
for large Z.
Fig. 25. - An illustration of the offsets in the position of reduced proper motion sequences as a function of distance,
position on the sky, and population. Each panel shows the median value of $5 \log(v_t / v_t^{\rm NGP})$, where $v_t$ is
the heliocentric tangential velocity and $v_t^{\rm NGP}$ is its value at the north Galactic pole, in a Lambert
projection of the northern Galactic hemisphere. The maps are color-coded according to the legend shown in the middle of
the figure (magnitudes), and are constructed using stars with $0.2 < g-r < 0.4$. Stars are separated into halo and disk
populations using photometric metallicity (for details see I08a). The top left panel shows results for halo stars with
distances in the 3-4 kpc range. The other three panels correspond to disk stars in the distance ranges 3-4 kpc,
2-2.5 kpc, and 1-1.5 kpc.
Fig. 26. - The number of unresolved binary systems (normalized by the total count in a given $g-i$ bin) with a
magnitude offset $\Delta M_r = M_r({\rm assumed}) - M_r({\rm true})$ as a function of the system's $g-i$ color. The
assumed absolute magnitude for a system with a given $g-i$ color, $M_r({\rm assumed})$, was calculated using Eq. 15
(Eq. 1 from J08), while the true absolute magnitude, $M_r({\rm true})$, was calculated by adding up the luminosities of
the components. The mean, median, and the rms scatter of $\Delta M_r$ are shown with the dotted, solid, and dashed
lines, respectively.
Fig. 27. - The distribution of differences between the magnitude offset, $\Delta M_r$, and the median magnitude offset,
$\mu$, normalized by the rms scatter, $\sigma$ (solid line), can be modeled as a 0.9 wide Gaussian (dotted line).
Fig. 28. - The distribution of $\delta$ values for the mock sample of wide binaries with both components redder than
$g-i = 2.0$ (open circles). In this sample, a star has a 40% probability of being an unresolved binary system. Single
star-single star configurations contribute the central narrow Gaussian (dotted line), unresolved binary-unresolved
binary configurations contribute the central wide Gaussian (thin solid line), while the single star-unresolved binary
configurations contribute the left and the right Gaussians (dot-dashed lines). The centers, widths, and areas of the
Gaussians are: $N_1$(0.00, 0.15, 0.34), $N_2$(0.06, 0.35, 0.28), $N_3$(-0.64, 0.18, 0.18), $N_4$(0.71, 0.17, 0.19) for
the narrow, wide, left, and right Gaussians, respectively.



